[ https://issues.apache.org/jira/browse/PHOENIX-3538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812736#comment-15812736 ]
Gabriel Reid commented on PHOENIX-3538: --------------------------------------- Sorry for such a slow reply on this [~kalyanhadoop]. There seems to be an issue with your patch file. When I try to import it, I get the following error: {code} $ patch -p1 < PHOENIX-3538-final.patch patching file phoenix-core/src/it/java/org/apache/phoenix/end2end/RegexBulkLoadToolIT.java patching file phoenix-core/src/main/java/org/apache/phoenix/mapreduce/RegexBulkLoadTool.java patching file phoenix-core/src/main/java/org/apache/phoenix/mapreduce/RegexToKeyValueMapper.java patching file phoenix-core/src/main/java/org/apache/phoenix/util/regex/ObjectToArrayConverter.java patching file phoenix-core/src/main/java/org/apache/phoenix/util/regex/RegexUpsertExecutor.java patch unexpectedly ends in middle of line patch: **** malformed patch at line 835: {code} In any case, I took a look through the patch file, and it looks quite good. Just a few minor comments on things I noticed though: * it looks like there is quite a bit of copy/paste of the existing JSON and/or CSV code in the RegexUpsertExecutor and ObjectToArrayConverter. There are also some references to JSON within these regex classes. At a minimum, the references to JSON in error/exeption messages should be cleaned up, and even better would be if the code duplication there was improved so that we've only got one copy of the same logic. * While it's good that there is javadoc in RegexBulkLoadTool and RegexToKeyValueMapper classes, it looks to me like it was pretty much copied from the JSON variant of those classes. I think that there should be some clearer explanation of what the classes do, i.e. how the regex-extracted values need to match up with the columns being upserted, and how non-matching lines are handled. Right now the javadoc says that it takes "REGEX input lines", which is pretty confusing and not really what is being done by the class. > Regex Bulkload Tool > ------------------- > > Key: PHOENIX-3538 > URL: https://issues.apache.org/jira/browse/PHOENIX-3538 > Project: Phoenix > Issue Type: New Feature > Reporter: Kalyan > Assignee: Kalyan > Priority: Minor > Attachments: PHOENIX-3538-final.patch, PHOENIX-3538-v1.patch, > PHOENIX-3538.patch > > > To work with complex data , we can regex to load directly. > Similar to JSON Bulkload Tool & CSV Bulkload Tool -- This message was sent by Atlassian JIRA (v6.3.4#6332)