Read the documentation of the string and vector classes. Demonstrate that C is 
a TC language and then leave me alone.
I NEVER said that everything is to be done only with <,> pairs. But watch out for the specification: the <,> pair is a relative delimiter, especially when something like the [,] pairs comes up with different priorities. I am honestly starting to get annoyed by this. I am fed up with people who try to reinvent the wheel for no reason. Use the Xerces XML parser or one of the tiny XML parsers out there.

I am fed up. It was not a parser I was after, but rather converting the entire book into a living meta-program. You did not get it, so do it as you wish. And by the way, what you wrote about delimiters is not right. DOCTYPE has [,] too. If you have <!ENTITY statements, they sit within [,] pairs when in a DTD. Your parser does not account for priorities SHIFTING from <,> pairs to [,] pairs. <!-- and --> are as easy to solve as all the rest once you take this into consideration (shifting priorities among delimiters, and when that should happen).
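
To make the point about shifting priorities concrete, here is a minimal C sketch. It is not code from this thread: doctype_end() is a hypothetical helper that is handed a pointer to the start of a <!DOCTYPE declaration. While it is between '[' and ']', a '>' no longer closes the declaration. Quoted literals and comments in the internal subset are skipped whole; processing instructions and parameter entities are ignored for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: s points at "<!DOCTYPE".  Returns a pointer to the
 * '>' that closes the declaration, or NULL if it is unterminated.
 * Inside the [ ... ] internal subset a '>' does not end the declaration. */
const char *doctype_end(const char *s)
{
    int in_subset = 0;   /* non-zero while between '[' and ']' */

    assert(s != NULL);
    while (*s != '\0') {
        if (*s == '"' || *s == '\'') {
            /* Quoted literal: jump to the matching quote. */
            const char *close = strchr(s + 1, *s);
            if (close == NULL) return NULL;
            s = close + 1;
            continue;
        }
        if (in_subset && strncmp(s, "<!--", 4) == 0) {
            /* Comment in the subset: jump past its "-->". */
            const char *close = strstr(s + 4, "-->");
            if (close == NULL) return NULL;
            s = close + 3;
            continue;
        }
        if (*s == '[')                    in_subset = 1;
        else if (*s == ']')               in_subset = 0;
        else if (*s == '>' && !in_subset) return s;
        s++;
    }
    return NULL;
}
```

Called on a declaration with an internal subset, it returns the '>' that follows the closing ']', not the first '>' it meets inside an <!ENTITY statement.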


ONCE AGAIN: character-by-character parsing is not the only way to go. 
Find a bash snapshot I posted a while ago, port it to C and you are all done.
ONCE AGAIN: READ THE STRING AND VECTOR CLASS documentation.

once again:

Fed up, no thanks.

I hope that the project maintainer / leader / god / daemon whatever starts 
working on janitor skills.


Fed up, premature constructive criticism is what makes opensource go backwards. 
 Do something about it.


Bruce Dubbs wrote:
Jean Charles Passard wrote:
I'm trying to go deeper into XML analysis, but I'm really annoyed by
what I read in the W3C specification.
Especially about your point 7.
   I have noted these delimiters:
      1. < >
      2. <!-- -->
      3. <? ?>
      4. <![CDATA[  ]]>
      5. <!DOCTYPE  >
      6. <! >

   They all give problems if I try to parse only on <> :
      1. it's ok ;)
      2. can have < and/or > inside
      3. it's ok too.
      4. can have < and/or > inside
      5. can have <!-- --> <! > and []
      6. it's ok
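
The six delimiters listed above can be told apart by their opening characters, and each has its own closing sequence. The following C sketch is an illustration only, not code from this thread: the struct markup table and the classify_markup() and skip_markup() names are hypothetical. It searches for the closer with strstr() instead of looking for a bare '>'. It is deliberately simplified: the DOCTYPE entry ignores the [ ] internal subset, and a '>' inside an attribute value would end an ordinary tag early.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One entry per construct: how it opens and what closes it. */
struct markup {
    const char *open;
    const char *close;
};

/* Longer openers come first, so "<!--" wins over "<!" and "<". */
static const struct markup markup_table[] = {
    { "<![CDATA[", "]]>" },
    { "<!DOCTYPE", ">"   },   /* simplification: ignores the [ ] subset */
    { "<!--",      "-->" },
    { "<?",        "?>"  },
    { "<!",        ">"   },
    { "<",         ">"   },
};

/* Return the table entry whose opener starts at s, or NULL if none does. */
const struct markup *classify_markup(const char *s)
{
    size_t i;

    assert(s != NULL);
    for (i = 0; i < sizeof markup_table / sizeof markup_table[0]; i++) {
        const char *open = markup_table[i].open;
        if (strncmp(s, open, strlen(open)) == 0)
            return &markup_table[i];
    }
    return NULL;
}

/* Return a pointer just past the construct starting at s, or NULL if s
 * does not start markup or the closing sequence is missing. */
const char *skip_markup(const char *s)
{
    const struct markup *m = classify_markup(s);
    const char *end;

    if (m == NULL) return NULL;
    end = strstr(s + strlen(m->open), m->close);
    return end != NULL ? end + strlen(m->close) : NULL;
}
```

With this table a comment such as `<!-- a > b -->` is skipped as a whole, because the search is for "-->" and not for the first '>'.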

   I can't see what idea would make a good parser without doing it char by
char.

Of course you have to parse the input character by character.  The way
to do this is with a state machine.  When you get a '<' character you go
into an intermediate state.  You then read the next character to decide
what state to go into next.
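
As a rough illustration of that idea applied to XML, here is a small C sketch. It is not the attached program: count_start_tags() and the ST_* state names are made up for this example. ST_LT is the intermediate state entered on '<', and the next character picks the following state. The sketch assumes reasonably well-formed input and treats CDATA sections and DOCTYPE declarations as plain declarations that end at the first '>'.

```c
#include <assert.h>
#include <stddef.h>

/* States of the scanner; ST_LT is the intermediate state entered on '<'. */
enum xml_state {
    ST_TEXT, ST_LT, ST_TAG, ST_BANG, ST_BANG_DASH, ST_DECL,
    ST_COMMENT, ST_COMMENT_DASH, ST_COMMENT_DASH2, ST_PI, ST_PI_QMARK
};

/* Hypothetical example: count element start tags in a NUL-terminated
 * string, ignoring end tags, comments, processing instructions and
 * declarations. */
int count_start_tags(const char *xml)
{
    enum xml_state state = ST_TEXT;
    int count = 0;

    assert(xml != NULL);
    for (; *xml != '\0'; xml++) {
        char c = *xml;

        switch (state) {
        case ST_TEXT:
            if (c == '<') state = ST_LT;
            break;
        case ST_LT:                              /* next character decides */
            if (c == '!')      state = ST_BANG;
            else if (c == '?') state = ST_PI;
            else if (c == '/') state = ST_TAG;   /* end tag, not counted */
            else if (c == '>') state = ST_TEXT;  /* stray "<>" */
            else { count++;    state = ST_TAG; }
            break;
        case ST_TAG:
        case ST_DECL:
            if (c == '>') state = ST_TEXT;
            break;
        case ST_BANG:                            /* seen "<!" */
            if (c == '-')      state = ST_BANG_DASH;
            else if (c == '>') state = ST_TEXT;
            else               state = ST_DECL;
            break;
        case ST_BANG_DASH:                       /* seen "<!-" */
            if (c == '-')      state = ST_COMMENT;
            else if (c == '>') state = ST_TEXT;
            else               state = ST_DECL;
            break;
        case ST_COMMENT:
            if (c == '-') state = ST_COMMENT_DASH;
            break;
        case ST_COMMENT_DASH:                    /* seen "-" in a comment */
            state = (c == '-') ? ST_COMMENT_DASH2 : ST_COMMENT;
            break;
        case ST_COMMENT_DASH2:                   /* seen "--" in a comment */
            if (c == '>')      state = ST_TEXT;
            else if (c != '-') state = ST_COMMENT;
            break;
        case ST_PI:
            if (c == '?') state = ST_PI_QMARK;
            break;
        case ST_PI_QMARK:                        /* seen "?" in a PI */
            if (c == '>')      state = ST_TEXT;
            else if (c != '?') state = ST_PI;
            break;
        }
    }
    return count;
}
```

A '<' or '>' inside a comment never reaches the ST_LT or ST_TAG cases, which is why the comment body cannot be mistaken for a tag.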

I wrote a program once to count lines of C/C++ code that is not unlike
this problem.  When you take the issue of comments in C/C++ as well as
directives and tokens, it's quite similar to the XML problem.  I am
attaching the code as an example.

You can build the program with a simple: gcc -o count count-methods.c
To test, use: ./count -m count-meth*{c,h}.
The code is reasonably well commented.  :)

I also wrote a more sophisticated program to count the number of comment
words, variables, etc and the frequency of use, but I can't find it
right now.

  -- Bruce


------------------------------------------------------------------------

/******************************************************
* FILENAME: count-methods.c
* File for Program 3A from A Discipline for Software Engineering
*
*   Author: B. Dubbs
*   Start Date: 18 May 1998 (coding)
*
* EXPORTED FUNCTIONS: None
*
* Notes:
*   Using PSP0.1
    Write a program to count the logical lines of code (LOC), by
    methods/functions, in a set of programs.

    Note: A lot of design/code can be reused from Program 2A,
    Count LOC in program source files.

    Main program design:

    1. Declare and initialize global statistics
        Statistics include:
           Total LOC
           Total Blank Lines
           Total Comment Lines
           Total LOC with comments
           Total Functions

    2. Input

    Get the list of programs to count lines of code.  This
    input will be from the command line.  The program will not
    handle wildcards.

    If an error is encountered (no files specified), output an
    appropriate error message.

    3. Loop for each file input, count lines of code, by function
       and output results.

    a. Open the file.  If the file cannot be opened, output an error
       message and restart the loop.

    b. Initialize statistic counters.  The states the system can be
       in are: inCode, inCommentC, inCommentCpp.  The initial state
       is inCode.  The initial previous character is NUL.

       To determine when a function starts, we keep the most recent
       alphanumeric word.  If the word is immediately followed by a '('
       (after optional whitespace) and the line did not start with a
       '#' (a macro), the function has started.  We must also check to
       see if a '{' occurs after the function declaration.  If not,
       the function is declared, but not implemented.

       A function level is incremented when a '{' is found and
       decremented when a '}' is found.  When the level is decremented
       to zero, the function ends.

       Initialize functionLevel and nameString.

    c. For each character in the file:
          when asterisk
             if previous character is a '/' and mode != inCommentCpp then
                set mode = inCommentC

          when slash
             if previous character is a '/' and mode == inCode then
                set mode = inCommentCpp
             else if previous character is an '*' and mode == inCommentC
                set mode = inCode

          when newline or end-of-file
             update file counts (line-has-comment, line-has-code)
             if function-state is START, add line to pending count
             if print flag set or function-state is MIDDLE,
                update function counts
             reset code, comment, print, and macro flags

          when '{'
             if inCode
                if function-state = START
                   set function-state = MIDDLE
                   add pending lines of code
                if function-state = MIDDLE, increment nesting

          when '}'
             if inCode and function-state == MIDDLE,
                decrement nesting
                if nesting == 0
                    increment function count
                    set print flag
                    set function-state = NONE

          when '('
             if inCode and not inMacro and function-state == NONE
                set function-state = START
                copy current-name to function-name
                clear function counts

          when ';'
             if function-state == START and inCode
                set function-state = NONE

          when alphanumeric or '_'
             if inCode
                if function-state == NONE
                   if previous character is not alphanumeric
                      clear current name
                   append character to current name
                set line-has-code flag
             else
                set line-has-comment flag

          when other
             do nothing

          save current character as previous character

    d. Close the input file.
    e. Print the statistics for the file.
    f. Accumulate global statistics.

    4. Print the total statistics for all files.
    5. Exit
***************

    Supporting Functions

    Clear file counts
      Set all static file counters to zero

    Clear function counts
      Set all static function counters to zero

    Clear Total counts
      Set totals to zero

    Print Function Counts
    Print file Counts
       Print statistics for above

    Increment Pending LOC
    Add Pending LOC to Function totals

    Update file counts (line-has-comment, line-has-code)
    Update function counts (line-has-comment, line-has-code)
    (These are essentially the same but work on file and
     function counts respectively -- only one function updates
     the previous-line-blank flag)

       if line-has-comment and line-has-code
          increment code-with-comment counter
          increment code counter
          clear previous-line-blank flag
       else if line-has-comment
          increment comment counter
          clear previous-line-blank flag
       else if line-has-code
          increment-code-counter
          clear previous-line-blank flag
       else
          if previous-line was not blank
             increment blank-line counter
          set previous-line-blank flag
       endif
**********************************************************************/

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>   /* exit() */
#include <ctype.h>
#include <string.h>
#include "count-methods.h"

char* argv0;
int main(int argc, char* argv[])
{
    unsigned int i;
    int          methods = 0;
    char         filename[256];
    int          ch;
    extern char* optarg;
    int          skip = 0;
/**********************************************************
*    1. Initialize global statistics
*/

     ClearTotalCounts();
     *filename = 0;
     argv0 = argv[0];

/**********************************************************
*    2. Input
*
*    Get the list of programs to count lines of code by function.
*    This input will be from the command line.  The program will not
*    handle wildcards.
*
*    If an error is encountered (no files specified), output an
*    appropriate error message.
*/

     if (argc < 2) usage();

/*   2a.  Get arguments */

     while ((ch = getopt(argc, argv, "f:m")) != -1) {
        switch (ch)
        {
            case 'm':
                methods = 1;
                skip++;
                break;

            case 'f':
                /* Bounded copy: an over-long name is truncated, not overflowed */
                snprintf(filename, sizeof(filename), "%s", optarg);
                skip +=2;
                break;

            default:
                usage();
                break;
        }
     }

//printf("argc=%i, methods=%i, filename=%s \n", argc, methods, filename);
//return;
/**********************************************************
*    3. Loop for each file input, count the lines of code, by function
*       and output results.
*
*    a. Open the file. If the file cannot be opened, output an error
*       message and restart the loop.
*
*    b. Initialize statistic counters.  The states the system can be
*       in are: inCode, inCommentC, inCommentCpp.  The initial state
*       is inCode.  The initial previous character is NUL.
*/

     if ( *filename != '\0' )   /* filename is an array, so test for an empty string */
     {
         FILE* list;
         FILE* currentFile;
         char  file[256];

         list = fopen(filename, "r");
         if ( list == NULL )
         {
           fprintf(stderr, "Could not open file list %s.\n", filename);
         }
         else
         {
             while ( fgets(file, 255, list) != NULL )
             {
                 if (file[strlen(file)-1] == '\n') file[strlen(file)-1] = '\0';
                 if ( strlen(file) == 0 ) continue;
                 currentFile = fopen(file, "r");
                 if (currentFile == NULL)
                 {
                    fprintf(stderr, "Could not open file %s.\n", file);
                    continue;
                 }
                 ProcessFile(currentFile, methods, file);
             }
             fclose(list);
         }
     }

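     /* optind (set by getopt) is the index of the first non-option argument */
     for (i = (unsigned int)optind; i < (unsigned int)argc; i++)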
     {
         FILE* currentFile;

         currentFile = fopen(argv[i], "r");
         if (currentFile == NULL)
         {
             fprintf(stderr, "Could not open file %s.\n", argv[i]);
             continue;
         }

         ProcessFile(currentFile, methods, argv[i]);
     }

     PrintTotalCounts();
     return 0;
}


void ProcessFile(FILE* currentFile, int methods, char* filename)
{
         int previousChar;
         int lineHasCode;
         int lineHasComment;
         int inMacro;
         int printFlag;
         int nesting;

         char functionName[256];
         char currentName[256];

         enum CODE_STATUS mode;
         enum FUNCTION_STATUS functionState;

         ClearFileCounts();

         previousChar = 0;
         nesting      = 0;

         lineHasCode    = FALSE;
         lineHasComment = FALSE;
         inMacro        = FALSE;
         printFlag      = FALSE;
         mode           = IN_CODE;
         functionState  = NONE;
         currentName[0] = '\0';

/**********************************************************
*    3. For each character in the file:
*          when asterisk
*             if previous character is a '/' and mode != inCommentCpp then
*                set mode = inCommentC
*
*          when slash
*             if previous character is a '/' and mode == inCode then
*                set mode = inCommentCpp
*             else if previous character is an '*' and mode == inCommentC
*                set mode = inCode
*/
         while (TRUE)
         {
             int currentChar;

             currentChar = fgetc(currentFile);
#ifdef DEBUG
             if (currentChar == EOF) printf("\nEOF\n");
             else putchar(currentChar);
#endif
             switch (currentChar)
             {
                 case '*':
                     if (previousChar == '/' && mode != IN_COMMENT_CPP)
                     {
                         mode = IN_COMMENT_C;
                     }
                     break;

                 case '/':
                     if (previousChar == '/' && mode == IN_CODE)
                     {
                         mode = IN_COMMENT_CPP;
                     }
                     else if (previousChar == '*' && mode == IN_COMMENT_C)
                     {
                         mode = IN_CODE;
                     }
                     break;

/*         when newline or end-of-file
 *            if code and function only started, count line as pending
 *            update counts (line-has-comment, line-has-code)
 *               need to take care of last line of function when not MIDDLE
 *               so we use the print flag
 *            if print flag is set, print function counts
 *            reset code, comment, print, and macro flags
 */
                 case EOF:
                 case '\n':
                     if (previousChar=='\n'  &&  currentChar == EOF)
                     {
                         break;
                     }
                     if (lineHasCode && functionState == START)
                     {
                         IncrementPending();
                     }

                     if (functionState == MIDDLE || printFlag)
                     {
                         UpdateFunctionCounts(lineHasComment, lineHasCode);
                     }

                     UpdateFileCounts(lineHasComment, lineHasCode);

                     if (printFlag && methods == 1)
                     {
                         PrintFunctionCounts(functionName);
                     }

                     if (mode == IN_COMMENT_CPP)
                     {
                         mode = IN_CODE;
                     }

#ifdef DEBUG
                     printf("lineHasCode = %i ", lineHasCode);
                     printf("lineHasComment = %i ", lineHasComment);
                     printf("printFlag = %i ", printFlag);
                     printf("mode=%i ", mode);
                     printf("fctnState=%i ", functionState);
                     printf("inMacro=%i ", inMacro);
                     printf("nesting=%i\n", nesting);
#endif

                     lineHasCode    = FALSE;
                     lineHasComment = FALSE;
                     inMacro        = FALSE;
                     printFlag      = FALSE;
                     break;

/*         when '{'
 *           if inCode
 *             if function-state == START, set function-state = MIDDLE
 *             if function-state == MIDDLE, increment nesting
 *
 *         when '}'
 *           if inCode and function-state == MIDDLE,
 *             decrement nesting
 *             if nesting == 0
 *               set print flag
 *               set function-state = NONE
 *               increment function count
 *
 *         when '('
 *           if inCode and not inMacro and function-state == NONE,
 *             set function-state = START
 *             copy current-name to function-name
 *             clear function counts
 */

                 case '{':
#ifdef DEBUG
                     printf("\n{  mode=%i, functionState=%i\n", mode, functionState);
#endif
                     if (mode == IN_CODE)
                     {
                         if (functionState == START) /* Now we really have a function */
                         {
                             functionState = MIDDLE;
                             AddPending();
                         }
                         if (functionState == MIDDLE)
                         {
                             nesting++;
                         }
                     }
                     break;

                 case '}':
#ifdef DEBUG
                     printf("\n}  mode=%i, functionState=%i\n", mode, functionState);
#endif
                     if (mode == IN_CODE  &&  functionState == MIDDLE)
                     {
                         nesting--;
                         if (nesting == 0)  /* End of function found */
                         {
                             IncrementFunctionCount();
                             printFlag = TRUE;  /* Print at end of line */
                             functionState = NONE;
                         }
                     }
                     break;

                 case '(':    /* looking for start of function */
#ifdef DEBUG
                     printf("\n(  mode=%i, functionState=%i, inMacro=%i\n",
                            mode, functionState, inMacro);
#endif
                     if (mode == IN_CODE  &&  functionState == NONE  &&  !inMacro)
                     {
                         functionState = START;
                         strcpy(functionName, currentName);
                         ClearFunctionCounts();
#ifdef DEBUG
                     printf("\n(  functionName=%s, functionState=%i\n",
                            functionName, functionState);
#endif
                     }
                     break;

/*         when ';'
 *           if function-state == START and inCode
 *              set function-state = NONE
 *
 *         when '#'
 *           if inCode and line-has-code is false,
 *              set inMacroFlag
 */

                 case ';':
                     if (functionState == START  &&  mode == IN_CODE)
                     {
                         functionState = NONE;  /* Only a declaration */
                     }
                     break;

                 case '#':
                     if (mode == IN_CODE  && !lineHasCode)
                     {
                         inMacro = TRUE;
                     }
                     break;

                 default:
                     break;
             } /* switch */

             if (currentChar == EOF)
             {
                 break;
             }

/*         when alphanumeric or '_'
 *           if inCode
 *             if function-state == NONE
 *               if previous character is not alphanumeric
 *                 clear current name
 *               append character to current name
 *             set line-has-code flag
 *           else
 *             set line-has-comment flag
 *
 *         when other
 *           do nothing
 */

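             /* currentChar comes from fgetc() and is not EOF here, so it is
              * already a valid argument for isalnum(); casting to char could
              * make it negative. */
             if (isalnum(currentChar)  ||  currentChar == '_')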
             {
                 if (mode == IN_CODE)
                 {
                     lineHasCode = TRUE;
                     if (functionState == NONE) /* Looking for function name */
                     {
                         int length;

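                         /* Start a new name unless the previous character was
                          * part of an identifier (a '_' must not reset it). */
                         if (!isalnum(previousChar) && previousChar != '_')
                         {
                             currentName[0] = 0;
                         }
                         length = strlen(currentName);
                         if (length < (int)sizeof(currentName) - 1)  /* bounds check */
                         {
                             currentName[length] = (char)currentChar;
                             currentName[length+1] = 0;
                         }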
                     }
                 }
                 else
                 {
                     lineHasComment = TRUE;
                 }
             }

/* save current character as previous character */
             previousChar = currentChar;
         }  /* While char not EOF */

/***************************************************************
 *   d. Close the input file
 *   e. Print the statistics for the file.
 *   f. Accumulate global statistics
 *
 *   4. Print the total statistics for all files.
 */

         fclose(currentFile);
         PrintFileCounts(filename);
         UpdateTotalCounts();
}






/*******************************************************************
 *  Helper functions to manage total counts
 *
 *  Counts are local to file
 *
 */

static unsigned int totalCode;
static unsigned int totalComments;
static unsigned int totalBlank;
static unsigned int totalCommentsWithCode;
static unsigned int totalFunctions;
static unsigned int totalFiles;
static unsigned int headerPrinted;
static unsigned int fileCodeLines;
static unsigned int fileCommentLines;
static unsigned int fileBlankLines;
static unsigned int fileCommentsWithCode;
static unsigned int previousLineBlank;

void ClearTotalCounts(void)
{
    totalCode             = 0;
    totalComments         = 0;
    totalBlank            = 0;
    totalCommentsWithCode = 0;
    totalFunctions        = 0;
    totalFiles            = 0;
    headerPrinted         = FALSE;
}

void UpdateTotalCounts(void)
{
    totalCode             += fileCodeLines;
    totalComments         += fileCommentLines;
    totalBlank            += fileBlankLines;
    totalCommentsWithCode += fileCommentsWithCode;
}

void PrintTotalCounts(void)
{
    PrintHeader();
    printf("\n%13u%12u%14u%19u Total\n", totalCode, totalBlank,
           totalComments, totalCommentsWithCode);
    printf("\n\nTotal Functions %u\n", totalFunctions);
}

/*************************************************************************
 * Helper functions to manage file counts
*/

void IncrementFileCount(void)
{
    totalFiles++;
}

void ClearFileCounts(void)
{
    fileCodeLines        = 0;
    fileCommentLines     = 0;
    fileBlankLines       = 0;
    fileCommentsWithCode = 0;
    previousLineBlank    = FALSE;
}

void UpdateFileCounts(int lineHasComment, int lineHasCode)
{
    if (lineHasComment && lineHasCode)
    {
        fileCommentsWithCode++;
        fileCodeLines++;
        previousLineBlank = FALSE;
    }
    else if (lineHasComment)
    {
        fileCommentLines++;
        previousLineBlank = FALSE;
    }
    else if (lineHasCode)
    {
        fileCodeLines++;
        previousLineBlank = FALSE;
    }
    else
    {
        if (previousLineBlank == FALSE)
        {
            fileBlankLines++;
        }
        previousLineBlank = TRUE;
    }
#ifdef DEBUG
    printf("fileCodeLines=%u\n", fileCodeLines);
#endif
}

void PrintHeader(void)
{
    if (!headerPrinted)
    {
        printf("Lines_of_code Blank_Lines Comment_Lines "
               "Comments_with_Code   Func name/Filename\n");
        headerPrinted = TRUE;
    }
}

void PrintFileCounts(char* filename)
{
    PrintHeader();
    printf("%13u%12u%14u%19u %s\n", fileCodeLines, fileBlankLines, fileCommentLines,
            fileCommentsWithCode, filename);
}

/*************************************************************************
 * Helper functions to manage function counts
*/

static unsigned int functionCodeLines;
static unsigned int functionCommentLines;
static unsigned int functionBlankLines;
static unsigned int functionCommentsWithCode;
static unsigned int functionCodePending;

void IncrementPending(void)
{
    functionCodePending++;
}

void AddPending(void)
{
    functionCodeLines += functionCodePending;
}

void IncrementFunctionCount(void)
{
    totalFunctions++;
}

void ClearFunctionCounts(void)
{
    functionCodeLines        = 0;
    functionCommentLines     = 0;
    functionBlankLines       = 0;
    functionCommentsWithCode = 0;
    functionCodePending      = 0;
}

/* Must be called before UpdateFileCounts to take care of
 * previous blank line update.
 */

void UpdateFunctionCounts(int lineHasComment, int lineHasCode)
{
    if (lineHasComment && lineHasCode)
    {
        functionCommentsWithCode++;
        functionCodeLines++;
    }
    else if (lineHasComment)
    {
        functionCommentLines++;
    }
    else if (lineHasCode)
    {
        functionCodeLines++;
    }
    else
    {
        if (previousLineBlank == FALSE)
        {
            functionBlankLines++;
        }
    }
#ifdef DEBUG
    printf("functionCodeLines=%u\n", functionCodeLines);
#endif
}

void PrintFunctionCounts(char* functionName)
{
    PrintHeader();
    printf("%13u%12u%14u%19u   %s\n",
            functionCodeLines, functionBlankLines, functionCommentLines,
            functionCommentsWithCode, functionName);
}

void usage(void)
{
    (void) fprintf(stderr, "usage: %s [-f file] [-m] [file1 ... ]\n", argv0);
    (void) fprintf(stderr, "  m         : print method/function counts\n");
    (void) fprintf(stderr, "  f filename: get files to count from filename\n");
    (void) fprintf(stderr, "  a file or list of files must be specified\n");
    exit(1);
}


------------------------------------------------------------------------

/******************************************************
 * FILENAME: count-methods.h
 * Header file for Program 3A from A Discipline for Software Engineering
 *
 *   Author: B. Dubbs
 *   Start Date: 20 May 1998
 *
 * EXPORTED FUNCTIONS: None
 *
 * Notes:
 *
 * *****************************************************/

#ifndef COUNT_METHODS_H
#define COUNT_METHODS_H

#include <stdio.h>   /* FILE, used by ProcessFile() */

#define TRUE  1
#define FALSE 0

enum CODE_STATUS
{   IN_CODE,
    IN_COMMENT_C,
    IN_COMMENT_CPP
};

enum FUNCTION_STATUS
{   START,
    MIDDLE,
    NONE
};

void ClearTotalCounts(void);
void UpdateTotalCounts(void);
void PrintTotalCounts(void);

void ClearFileCounts(void);
void UpdateFileCounts(int lineHasComment, int lineHasCode);
void PrintFileCounts(char* filename);

void ClearFunctionCounts(void);
void UpdateFunctionCounts(int lineHasComment, int lineHasCode);
void PrintFunctionCounts(char* functionName);
void IncrementPending(void);
void AddPending(void);

void IncrementFunctionCount(void);
void IncrementFileCount(void);

void ProcessFile(FILE* currentFile, int methods, char* file);
void PrintHeader(void);

void usage(void);
#endif

--
http://linuxfromscratch.org/mailman/listinfo/alfs-discuss
FAQ: http://www.linuxfromscratch.org/faq/
Unsubscribe: See the above information page
