[ 
https://issues.apache.org/jira/browse/PIG-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793672#action_12793672
 ] 

Gerrit Jansen van Vuuren commented on PIG-1117:
-----------------------------------------------

Hi, 
Thanks for the review, I've refactored the code abit to include the changes 
that you've mentioned. 
I've decided to convert Byte and Boolean to Int when found. although where I 
work we haven't used Byte or Boolean yet.

I'll try to get the diff for the version up tonight.

With the hive dependencies I've made some changes to the build.xml to actually 
download the hive tar from there website (only if the hive libs are not already 
downloaded so this will only happen once).
This should allows to just type ant hive-jar and the deps are downloaded 
automatically.

> Pig reading hive columnar rc tables
> -----------------------------------
>
>                 Key: PIG-1117
>                 URL: https://issues.apache.org/jira/browse/PIG-1117
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Gerrit Jansen van Vuuren
>         Attachments: HiveColumnarLoader.patch, HiveColumnarLoaderTest.patch, 
> PIG-1117.patch
>
>
> I've coded a LoadFunc implementation that can read from Hive Columnar RC 
> tables, this is needed for a project that I'm working on because all our data 
> is stored using the Hive thrift serialized Columnar RC format. I have looked 
> at the piggy bank but did not find any implementation that could do this. 
> We've been running it on our cluster for the last week and have worked out 
> most bugs.
>  
> There are still some improvements to be done but I would need  like setting 
> the amount of mappers based on date partitioning. Its been optimized so as to 
> read only specific columns and can churn through a data set almost 8 times 
> faster with this improvement because not all column data is read.
> I would like to contribute the class to the piggybank can you guide me in 
> what I need to do?
> I've used hive specific classes to implement this, is it possible to add this 
> to the piggy bank build ivy for automatic download of the dependencies?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to