[
https://issues.apache.org/jira/browse/PIG-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
argho chatterjee updated PIG-4671:
----------------------------------
Priority: Critical (was: Major)
> Use MR Native Execution Optimally
> ---------------------------------
>
> Key: PIG-4671
> URL: https://issues.apache.org/jira/browse/PIG-4671
> Project: Pig
> Issue Type: Improvement
> Reporter: argho chatterjee
> Priority: Critical
>
> Hi Team,
> With respect to the feature development here
> https://issues.apache.org/jira/browse/PIG-506
> I tried the above approach, but PIG forces to store the data in one directory
> and then load the data from this directory.
> Eg :
> A = Load ....
> B = MAPREDUCE 'SomeJar.jar' Store A into input Load Output as ...
> Here we are loading and simply storing it back again for the Map-reduce Job
> To work.
> I do not think this is Optimized way because, suppose, I have implemented my
> Own Pig-Readers which is smart, and now I want to load Data Using this
> Pig-Reader into my Mapreduce job, then It will load using this reader and
> then Store it in some Directory from where the MR will take it as Input. (Lot
> of Unnecessary IO where I could directly fed the data From my Custom Pig
> Loader to my MapReduce Job)
> Can There be a way where the data loaded by A can be directly Fed to the
> Map-reduce Job.???!!!
> That is, If required, We shall Implement some Readers Of Pig In our MR jobs
> and then Use it to Read Data into MR.
> We have implemented some Smart Pig Readers , I want to use them in my map
> reduce and not use the native MR readers.
> Please have a look at this case scenario.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)