[ 
https://issues.apache.org/jira/browse/PHOENIX-3538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15814199#comment-15814199
 ] 

Gabriel Reid commented on PHOENIX-3538:
---------------------------------------

Thanks for the update [~kalyanhadoop]. However, the updates to the javadoc that 
you did weren't quite what I was going for -- it still doesn't really explain 
what the relationship is between the regular expression and the upserting of 
data. 

The removal of the copy-paste code is good, but could you move the 
json-specific class that is being used for both the regex loader and json 
loader into a more generically-named package and class? 

It would also be good to have some tests in the integration test code for 
handling input lines for which the input data doesn't match the regex. I think 
that ideally we want to also check that a MapReduce counter is incremented to 
reflect this kind of situation.
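
To make that concrete, here is a rough sketch of the behaviour I'd expect such 
a test to pin down. It is not code from the patch: the class name 
RegexLineSketch, the Result holder, and the mismatchedLines field are all 
hypothetical, and a plain long stands in for the MapReduce counter so that the 
example runs without Hadoop on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: split input lines into rows via regex capture groups,
// counting (instead of failing on) lines that do not match the pattern.
final class RegexLineSketch {

    // Holds the parsed rows plus a stand-in for a MapReduce counter.
    static final class Result {
        final List<String[]> rows = new ArrayList<>();
        long mismatchedLines = 0;
    }

    static Result parse(Pattern pattern, List<String> lines) {
        Result result = new Result();
        for (String line : lines) {
            Matcher m = pattern.matcher(line);
            if (!m.matches()) {
                // In a real mapper this would be a counter increment.
                result.mismatchedLines++;
                continue;
            }
            // Each capture group becomes one column value to upsert.
            String[] row = new String[m.groupCount()];
            for (int i = 0; i < row.length; i++) {
                row[i] = m.group(i + 1);
            }
            result.rows.add(row);
        }
        return result;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("(\\d+),(\\w+)");
        Result r = parse(p, Arrays.asList("1,foo", "not a match", "2,bar"));
        System.out.println(r.rows.size() + " rows, "
                + r.mismatchedLines + " mismatched");
    }
}
```

An integration test would then feed in a mix of matching and non-matching 
lines and assert on both the upserted rows and the counter value.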

Your patch file also still seems to be somewhat corrupted (I'm getting the same 
error when trying to import it). I'm able to fix it manually, but it would be 
good if it were to just work.

As one last thing, could you also add a documentation patch for 
http://phoenix.apache.org/bulk_dataload.html. Instructions for updating the 
website are here: http://phoenix.apache.org/building_website.html


> Regex Bulkload Tool
> -------------------
>
>                 Key: PHOENIX-3538
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3538
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: Kalyan
>            Assignee: Kalyan
>            Priority: Minor
>         Attachments: PHOENIX-3538-codecleanup.patch, 
> PHOENIX-3538-final.patch, PHOENIX-3538-v1.patch, PHOENIX-3538.patch
>
>
> To work with complex data, we can use a regex to load it directly.
> This is similar to the JSON Bulkload Tool and the CSV Bulkload Tool.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
