[ 
https://issues.apache.org/jira/browse/HCATALOG-259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209700#comment-13209700
 ] 

Alan Gates commented on HCATALOG-259:
-------------------------------------

There are a couple of workarounds for this:

1) Leave these broken and tell users that if they want to write the data back 
out they need to convert the record to a DefaultHCatRecord first.  A method 
could be provided to do this easily.  This would save memory in the 90% of 
cases where users are going to create new records anyway.

2) Go back to having a list of objects in the records, as Sushanth originally 
implemented it.  This would be more user friendly, but it would mean every 
LazyHCatRecord requires 8 more bytes and every get() incurs an if (to 
determine whether to use the list or the serialized object).  The cost of the 
if would be negligible, since it would always go the same way for all accesses 
of a tuple in a given operator.

I think I vote for 1, but I'm open to feedback.  Thoughts?
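
To make the trade-off concrete, here is a minimal sketch of both options in 
one class.  It does not use the real HCatalog classes: LazyRecordSketch, the 
IntFunction decoder, and toEagerList() are all hypothetical names, and a 
plain List stands in for DefaultHCatRecord.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

// Hypothetical stand-in for LazyHCatRecord; not the actual HCatalog class.
public class LazyRecordSketch {
    // Stands in for the serialized object plus its deserializer:
    // decodes field i on demand.
    private final IntFunction<Object> decoder;
    private final int size;
    // Option 2 only: the extra reference each record would carry.
    // Non-null when the record is backed by a materialized list.
    private List<Object> fields;

    public LazyRecordSketch(IntFunction<Object> decoder, int size) {
        this.decoder = decoder;
        this.size = size;
    }

    public LazyRecordSketch(List<?> fields) {
        this.decoder = null;
        this.size = fields.size();
        this.fields = new ArrayList<Object>(fields);
    }

    // Option 2: every get() pays one branch to pick the backing store.
    public Object get(int i) {
        if (fields != null) {
            return fields.get(i);
        }
        return decoder.apply(i);
    }

    // Option 1: an explicit conversion the user calls before writing the
    // record back out (the returned list stands in for DefaultHCatRecord).
    public List<Object> toEagerList() {
        List<Object> out = new ArrayList<Object>(size);
        for (int i = 0; i < size; i++) {
            out.add(get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        String[] raw = "alice,42".split(",");
        LazyRecordSketch lazy = new LazyRecordSketch(i -> raw[i], raw.length);
        System.out.println(lazy.toEagerList());
    }
}
```

Under option 1 only the conversion method would exist and the list field 
would go away; under option 2 the list field and the branch in get() stay, 
and write() could serialize from the list directly.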
                
> Make readFields() and write() in LazyHCatRecord work
> ----------------------------------------------------
>
>                 Key: HCATALOG-259
>                 URL: https://issues.apache.org/jira/browse/HCATALOG-259
>             Project: HCatalog
>          Issue Type: Sub-task
>            Reporter: Sushanth Sowmyan
>            Assignee: Alan Gates
>             Fix For: 0.4
>
>
> With the recent changes in HCATALOG-241, write() and readFields() will no 
> longer work.  This will cause a problem for MR users who want to filter out 
> records in their map job and then write the remaining records back out, 
> either to go to the reducer or to be saved on disk.
