[jira] Created: (PIG-96) It should be possible to spill big databags to HDFS

Pi Song (JIRA) Wed, 06 Feb 2008 03:59:30 -0800

It should be possible to spill big databags to HDFS
---------------------------------------------------


                 Key: PIG-96
                 URL: https://issues.apache.org/jira/browse/PIG-96
             Project: Pig
          Issue Type: Improvement
          Components: data
            Reporter: Pi Song


Currently, databags are spilled only to local disk, which costs 2 disk I/O
operations. If databags are too big, this is not efficient.
We should take advantage of HDFS: if the databag is too big (determined by
DataBag.getMemorySize() > a big threshold), let's spill it to HDFS, and read it
back from HDFS in parallel when the data is required.
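
A minimal sketch of the threshold check, to make the proposal concrete. Only
DataBag.getMemorySize() comes from Pig; SpillPolicy, Target, chooseTarget and
the threshold parameter are hypothetical names used for illustration, and the
actual write to HDFS is left out.

```java
// Hypothetical sketch: SpillPolicy and Target are NOT existing Pig classes.
// The caller is assumed to pass in the value of DataBag.getMemorySize().
final class SpillPolicy {

    /** Where a bag should be spilled. */
    enum Target { LOCAL_DISK, HDFS }

    // Assumed configuration value; the issue only says "a big threshold".
    private final long hdfsThresholdBytes;

    SpillPolicy(long hdfsThresholdBytes) {
        if (hdfsThresholdBytes <= 0) {
            throw new IllegalArgumentException("threshold must be positive");
        }
        this.hdfsThresholdBytes = hdfsThresholdBytes;
    }

    /**
     * Bags strictly larger than the threshold go to HDFS;
     * everything else keeps the current local-disk behaviour.
     */
    Target chooseTarget(long bagMemorySizeBytes) {
        return bagMemorySizeBytes > hdfsThresholdBytes
                ? Target.HDFS
                : Target.LOCAL_DISK;
    }
}
```

With a policy like this, a spill routine could call
chooseTarget(bag.getMemorySize()) and open either a local file or an HDFS
output stream depending on the result.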

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
