Hi,

As a way to learn J, I'm working on a sample project: implementing a C/C++ preprocessor in J. Not being very familiar with the J libraries, I tend to implement everything directly in J; fortunately the task doesn't seem to be too large.
However, this area seems to be somewhat awkward for the parallel approach that an array-oriented language encourages. The input is read left to right, and the processing of later parts depends on the results of processing earlier parts.

One part of the preprocessor ought to be the extraction of preprocessing tokens, which may be header-names, identifiers, pp-numbers, char-constants, string-literals, punctuators, or single non-white-space characters that fit none of the other categories. Suppose the program reads a character, determines the type of the pp-token, and then needs to finish reading that token. This is a rather typical task. How do I read a string literal after the initial " has been read?

It might be possible, assuming the whole file is in memory, to do this while thinking in vectors, but that will probably be too slow; the file may be huge compared to the length of the literal. So a char-by-char solution is sought.

It may be done using the ^:_ construct with properly chosen arguments. The problems are with escape sequences: ones like \a or \t, which need to be replaced by the corresponding characters, and \\, \", \q, which need to be replaced with \, ", q respectively. x u ^:_ y means x u x u x u ... x u y, with iterations going on until the result stops changing. If the input file, as a string, is y, then x u y has to be another, different string, otherwise the iterations will stop. It's not clear how to handle escape sequences: we also have to maintain the processing state, at the very least a flag that says a \ was just read. At this point one could try the approach of having x be the input file and y a vector of integers describing the position in the input and the escape state. Also, the input may end suddenly, which is a syntax error but still has to be handled gracefully.

The sequential machine (dyadic ;:) also doesn't seem to be quite the right tool.
First, it's not clear how to construct the input for it: it's either a manual effort, which seems hard to debug, or a program that converts some other notation into the proper argument. Second, the sudden-EOF issue is still not addressed.

What does this mean? Is this just a task that is not well suited to J? Are some libraries practically required for tasks like this? Or am I missing some solution?

Alexander
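P.S. To make the ^:_ idea above concrete, here is a minimal sketch of the kind of thing I have in mind. It is only an illustration under my own assumptions, not a finished solution: all names (step, readstr, unescape, the status codes) are made up for this example, the state is a boxed list rather than a plain integer vector so that it can also carry the decoded characters, and only \a, \t and \n are treated specially, with every other escaped character standing for itself.

```j
NB. Hypothetical sketch; every name below is invented for illustration.
'NORMAL ESC DONE EOF' =: i. 4      NB. status codes 0 1 2 3

ESCFROM =: 'atn'                   NB. letters that may follow a backslash ...
ESCTO   =: 7 9 10 { a.             NB. ... and the characters they denote

NB. y is the character after a backslash; anything not in ESCFROM
NB. (such as \ " q) is returned unchanged.
unescape =: 3 : '(ESCFROM i. y) { ESCTO , y'

NB. One step of the scan.
NB. x is the whole input text; y is the boxed state  pos ; status ; out
step =: 4 : 0
  'pos st out' =. y
  if. st >: DONE do.
    y                              NB. finished: state unchanged, so ^:_ stops
  elseif. pos >: # x do.
    pos ; EOF ; out                NB. input ended inside the literal
  elseif. st = ESC do.
    (>: pos) ; NORMAL ; out , unescape pos { x
  elseif. '\' = pos { x do.
    (>: pos) ; ESC ; out           NB. remember that a backslash was read
  elseif. '"' = pos { x do.
    (>: pos) ; DONE ; out          NB. closing quote
  elseif. 1 do.
    (>: pos) ; NORMAL ; out , pos { x
  end.
)

NB. x is the input text; y is the index just after the opening quote.
NB. Result: next position ; final status ; decoded characters
readstr =: 4 : 0
  x step^:_ y ; NORMAL ; ''
)
```

With this, '"a\tb\"c" rest' readstr 1 should stop with position 9, status DONE, and the five characters a, TAB, b, ", c, while '"abc' readstr 1 should stop with position 4 and status EOF, so the sudden end of input shows up as an ordinary result rather than an error. I make no claim about speed; the sketch appends one character per step.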