[jira] [Commented] (CSV-70) Improve readability of CSVLexer
[ https://issues.apache.org/jira/browse/CSV-70?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238182#comment-13238182 ] Benedikt Ritter commented on CSV-70: That would mean to introduce a new Token type. How would you handle comment tokens? Just by reading everything after the comment start till line break? That could work. I guess we should discuss that with the ML. Improve readability of CSVLexer --- Key: CSV-70 URL: https://issues.apache.org/jira/browse/CSV-70 Project: Commons CSV Issue Type: Improvement Components: Parser Affects Versions: 1.0 Reporter: Benedikt Ritter Fix For: 1.0 There are several things that can be improved in the token lexer (this has also been discussed on ML, see http://markmail.org/thread/c6x5ji4v44nx5k4h): * Remove Token input parameter in nextToken() * Add convenience methods isDelimiter(c) and isEncapsulator(c) * Remove current caracter input parameter from methods * If possible: replace while(true) loops -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CSV-70) Improve readability of CSVLexer
[ https://issues.apache.org/jira/browse/CSV-70?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237807#comment-13237807 ] Benedikt Ritter commented on CSV-70: I have looked at that several times, but I don't know how to remove the recursive call. First I thought one could just change: {code:java} if (isCommentStart(c)) { // ignore everything till end of line and continue (incr linecount) in.readLine(); tkn = nextToken(tkn.reset()); } {code} to: {code:java} if (isCommentStart(c)) { // ignore everything till end of line and continue (incr linecount) in.readLine(); tkn.reset(); continue; } {code} But that will skip all the empty line processing... Maybe it will be easier to remove that recursive call once we have split up the parsing logic some more. Improve readability of CSVLexer --- Key: CSV-70 URL: https://issues.apache.org/jira/browse/CSV-70 Project: Commons CSV Issue Type: Improvement Components: Parser Affects Versions: 1.0 Reporter: Benedikt Ritter Fix For: 1.0 There are several things that can be improved in the token lexer (this has also been discussed on ML, see http://markmail.org/thread/c6x5ji4v44nx5k4h): * Remove Token input parameter in nextToken() * Add convenience methods isDelimiter(c) and isEncapsulator(c) * Remove current caracter input parameter from methods * If possible: replace while(true) loops -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CSV-70) Improve readability of CSVLexer
[ https://issues.apache.org/jira/browse/CSV-70?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238039#comment-13238039 ] Sebb commented on CSV-70: - Yes, I tried something similar and it broke the tests. I think it would be useful to provide (optional) access to the comment fields so one way to solve this would be to return the comment as a token, and deal will it at the record level. Improve readability of CSVLexer --- Key: CSV-70 URL: https://issues.apache.org/jira/browse/CSV-70 Project: Commons CSV Issue Type: Improvement Components: Parser Affects Versions: 1.0 Reporter: Benedikt Ritter Fix For: 1.0 There are several things that can be improved in the token lexer (this has also been discussed on ML, see http://markmail.org/thread/c6x5ji4v44nx5k4h): * Remove Token input parameter in nextToken() * Add convenience methods isDelimiter(c) and isEncapsulator(c) * Remove current caracter input parameter from methods * If possible: replace while(true) loops -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CSV-70) Improve readability of CSVLexer
[ https://issues.apache.org/jira/browse/CSV-70?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13236687#comment-13236687 ] Sebb commented on CSV-70: - Another tricky aspect of the lexer is that currently it does a recursive call for handling comments. Could this lead to stack overflow if a file contained lots of comments in sequence? It would be better to eliminate the recursive call. Improve readability of CSVLexer --- Key: CSV-70 URL: https://issues.apache.org/jira/browse/CSV-70 Project: Commons CSV Issue Type: Improvement Components: Parser Affects Versions: 1.0 Reporter: Benedikt Ritter Fix For: 1.0 There are several things that can be improved in the token lexer (this has also been discussed on ML, see http://markmail.org/thread/c6x5ji4v44nx5k4h): * Remove Token input parameter in nextToken() * Add convenience methods isDelimiter(c) and isEncapsulator(c) * Remove current caracter input parameter from methods * If possible: replace while(true) loops -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira