Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change 
notification.

The "LoadStoreRedesignProposal" page has been changed by PradeepKamath.
http://wiki.apache.org/pig/LoadStoreRedesignProposal?action=diff&rev1=17&rev2=18

--------------------------------------------------

  === Notes on implementation details ===
  This section documents, at a high level, the changes made, to give an overall 
connected picture that code comments may not provide. 
  
- ==== Changes to work with Hadoop !InputFormat model ====
+ ==== Changes to work with Hadoop InputFormat model ====
+ Hadoop allows only a single InputFormat per job. This is restrictive 
because Pig can process multiple inputs in the same MapReduce job (in the case 
of Join, Union or Cogroup). Pig handles this with !PigInputFormat, which is the 
!InputFormat that Pig communicates to Hadoop as the job's !InputFormat.
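
The single-format-per-job constraint described above can be worked around with a wrapper that delegates to one format per input. The following is only a conceptual sketch in Python of that delegation idea; every class and method name in it (LineFormat, TaggedSplit, CompositeInputFormat, get_splits, read) is hypothetical and is not Pig's or Hadoop's actual API.

```python
# Conceptual sketch only: names are hypothetical, not Pig's or Hadoop's API.
from dataclasses import dataclass
from typing import Iterator, List


class LineFormat:
    """Stand-in for one input's own format: splits a list of lines into chunks."""

    def __init__(self, lines: List[str], lines_per_split: int = 2):
        self.lines = lines
        self.lines_per_split = lines_per_split

    def get_splits(self) -> List[range]:
        n = self.lines_per_split
        return [range(i, min(i + n, len(self.lines)))
                for i in range(0, len(self.lines), n)]

    def read(self, split: range) -> Iterator[str]:
        return (self.lines[i] for i in split)


@dataclass
class TaggedSplit:
    """A wrapped split that remembers which input it came from."""
    input_index: int
    wrapped: range


class CompositeInputFormat:
    """The one format handed to the job; it delegates to per-input formats."""

    def __init__(self, formats: List[LineFormat]):
        self.formats = formats

    def get_splits(self) -> List[TaggedSplit]:
        # Ask every input's format for its splits and tag each with its index.
        return [TaggedSplit(idx, s)
                for idx, fmt in enumerate(self.formats)
                for s in fmt.get_splits()]

    def read(self, split: TaggedSplit) -> Iterator[str]:
        # Route the read back to the format that produced the split.
        return self.formats[split.input_index].read(split.wrapped)


left = LineFormat(["a1", "a2", "a3"])
right = LineFormat(["b1"])
job_format = CompositeInputFormat([left, right])
splits = job_format.get_splits()
records = [(s.input_index, rec) for s in splits for rec in job_format.read(s)]
```

Tagging each split with its input index is what lets the reading side pick the right per-input reader, so the job itself only ever sees one format.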
  
  ==== Changes to work with Hadoop !OutputFormat model ====
  
@@ -530, +531 @@

   * fix lineage code to use !LoadCaster instead of !LoadFunc
   * local mode needs to be ported
   * !PigDump needs to be ported
-  * !POLoad needs to be ported
+  * POLoad needs to be ported
+  * Need to handle passing loadfunc-specific info between different instances 
of loadfunc (different instances in the front end, and between front end and 
back end - we need what is required in PIG-602) (setPartitionFilter() and 
pushOperators(), for example, need 
-  * Need to handle passing loadfunc specific info between different instances 
of loadfunc (Different instances in front end and 
- between front end and back end - we need what is required in PIG-602) 
(setPartitionFilter() and pushOperators()for example needs 
  this - these methods are called in the front end, but the information passed 
is needed in the back end)
+  * For !ResourceSchema to be effectively used for communicating schema, we 
must fix the two-level access issues with the schema of bags in the current 
schema before we make these changes, otherwise that same contagion will afflict 
us here. 
-  * For !ResourceSchema to be effectively used for communicating schema, we 
must fix the two level access issues with 
- schema of bags in current schema before we make these changes, otherwise that 
same contagion will afflict us here. 
   * Input/Output handler code in streaming needs to be ported 
   * split by file will have to be removed from the language
   * fix code with FIXME in comment relating to load-store redesign
   * Decide on what we should do with !ReversibleLoadFunc and multiquery 
optimization
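
To illustrate the item above about passing loadfunc-specific info between instances: one possible approach is for the front-end instance to serialize its state into the job configuration, and for the back-end instance to read it back. The Python sketch below is an assumption-laden illustration only. SketchLoader, store_state, restore_state, the "loader.state." key prefix, and the per-loader signature are all hypothetical; this is not the mechanism proposed in PIG-602 or Pig's API.

```python
# Conceptual sketch only: all names are hypothetical, not Pig's API.
import json
from typing import Dict, Optional


class SketchLoader:
    """A loader whose front-end and back-end copies are separate instances."""

    def __init__(self, signature: str):
        # Assumed: a stable id that both copies of the same loader share.
        self.signature = signature
        self.partition_filter: Optional[str] = None

    def set_partition_filter(self, expr: str) -> None:
        # Called in the front end only.
        self.partition_filter = expr

    def _key(self) -> str:
        return "loader.state." + self.signature

    def store_state(self, job_conf: Dict[str, str]) -> None:
        # Front end: serialize state into the configuration shipped with the job.
        job_conf[self._key()] = json.dumps(
            {"partition_filter": self.partition_filter})

    def restore_state(self, job_conf: Dict[str, str]) -> None:
        # Back end: a fresh instance recovers what the front end recorded.
        raw = job_conf.get(self._key())
        if raw is not None:
            self.partition_filter = json.loads(raw)["partition_filter"]


conf: Dict[str, str] = {}
front = SketchLoader("A")
front.set_partition_filter("year == 2009")
front.store_state(conf)

back = SketchLoader("A")    # same loader, new instance in the back end
back.restore_state(conf)

other = SketchLoader("B")   # a different loader sees none of A's state
other.restore_state(conf)
```

Keying the stored state by a per-loader signature keeps two loaders in the same job from overwriting each other's information.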
- 
- 
  
  == Changes ==
  Sept 23 2009, Gates
