[ https://issues.apache.org/jira/browse/HADOOP-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ari Rabkin updated HADOOP-5087:
-------------------------------

    Attachment: fixedregex.patch

This patch removes trailing spaces from adaptor parameters; I'm pretty sure this is the Right Thing.

> Regex for Cmd parsing contains an error
> ---------------------------------------
>
>                 Key: HADOOP-5087
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5087
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/chukwa
>         Environment: HADOOP-4947 uses a regex to parse Chukwa commands, but there is an error in the regex.
> The current regex is:
>   Pattern addCmdPattern =
>     Pattern.compile("[aA][dD][dD]\\s+(\\S+)\\s+(\\S+)\\s+(.*\\S)?\\s*(\\d+)\\s*");
> It does not correctly parse this valid checkpoint entry:
>   "ADD org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8NewLineEscaped Syslog 0 /var/log/messages 114027"
> Parsing result:
>   adaptorName  org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8NewLineEscaped
>   dataType     Syslog
>   params       0 /var/log/messages 11402
>   offset       7
> Instead of:
>   adaptorName  org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8NewLineEscaped
>   dataType     Syslog
>   params       0 /var/log/messages
>   offset       114027
> The correct regex is:
>   "[aA][dD][dD]\\s+(\\S+)\\s+(\\S+)\\s+(.*\\s)?\\s*(\\d+)\\s*"
> Example of parsing:
>   "ADD org.apache.hadoop.chukwa.datacollection.adaptor.MySpecificAdaptor Syslog 0 my param1 param2 /var/log/messages 114027"
> Parsing result:
>   adaptorName  org.apache.hadoop.chukwa.datacollection.adaptor.MySpecificAdaptor
>   dataType     Syslog
>   params       0 my param1 param2 /var/log/messages
>   offset       114027
>            Reporter: Jerome Boulon
>            Assignee: Jerome Boulon
>         Attachments: fixedregex.patch, HADOOP-5087-2.patch, HADOOP-5087.patch
>
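
For anyone who wants to reproduce the behaviour described in the issue, here is a minimal standalone sketch that runs both regexes from the report against the checkpoint entry quoted there. The class name `AddCmdRegexDemo`, the `parse` helper, and the use of `Matcher.matches()` are assumptions made for illustration; this is not the Chukwa code and not the contents of any attached patch. The `trim()` call only mimics the "remove trailing spaces" idea mentioned in the comment.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of Chukwa.
class AddCmdRegexDemo {
    // Regex quoted in the report as the current (faulty) one: group 3 must end in \S.
    static final Pattern BROKEN = Pattern.compile(
        "[aA][dD][dD]\\s+(\\S+)\\s+(\\S+)\\s+(.*\\S)?\\s*(\\d+)\\s*");

    // Regex proposed in the report: group 3 must end in whitespace.
    static final Pattern FIXED = Pattern.compile(
        "[aA][dD][dD]\\s+(\\S+)\\s+(\\S+)\\s+(.*\\s)?\\s*(\\d+)\\s*");

    // Checkpoint entry quoted in the report.
    static final String LINE =
        "ADD org.apache.hadoop.chukwa.datacollection.adaptor.filetailer."
        + "CharFileTailingAdaptorUTF8NewLineEscaped Syslog 0 /var/log/messages 114027";

    // Returns {adaptorName, dataType, params, offset}, or null if the line does not match.
    static String[] parse(Pattern p, String line) {
        Matcher m = p.matcher(line);
        if (!m.matches()) {
            return null;
        }
        // Group 3 is optional; with FIXED it keeps a trailing space, so trim it.
        String params = m.group(3) == null ? "" : m.group(3).trim();
        return new String[] { m.group(1), m.group(2), params, m.group(4) };
    }

    public static void main(String[] args) {
        for (Pattern p : new Pattern[] { BROKEN, FIXED }) {
            String[] r = parse(p, LINE);
            System.out.println("params=[" + r[2] + "] offset=" + r[3]);
        }
    }
}
```

With the first pattern, the greedy `.*` followed by `\S` swallows every digit of the offset except the last one, which is the only digit left for `(\d+)`. Requiring group 3 to end in whitespace forces the split to happen at the last space, so the whole number lands in the offset group.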