Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change 
notification.

The "Pig070IncompatibleChanges" page has been changed by OlgaN.
http://wiki.apache.org/pig/Pig070IncompatibleChanges?action=diff&rev1=1&rev2=2

--------------------------------------------------

  = Backward incompatible changes in Pig 0.7.0 =
  
- Pig 0.7.0 will include some major changes to Pig most of them driven by the 
[[LoadStoreRedesignProposal | Load-Store redesign]]. Some of this changes will 
not be backward compatible and will require users to change the pig scripts or 
their UDFs. This document is intended to keep track of this changes to that we 
can document them for the release.
+ Pig 0.7.0 will include some major changes to Pig, most of them driven by the 
[[LoadStoreRedesignProposal | Load-Store redesign]]. Some of these changes will 
not be backward compatible and will require users to change their Pig scripts 
or their UDFs. This document is intended to keep track of such changes so that 
we can document them for the release.
  
- == Changes to the Load and Store functions ==
+ == Changes to the Load and Store Functions ==
  == Handling Compressed Data ==
+ 
+ In 0.6.0 and earlier versions, Pig supported bzip-compressed files with 
extensions of .bz or .bz2 as well as gzip-compressed files with the .gz 
extension. Pig was able to both read and write files in these formats, with the 
understanding that gzip-compressed files could not be split across multiple 
maps while bzip-compressed files could. Also, data compression was completely 
decoupled from the data format and the Load/Store functions, meaning that any 
loader could read compressed data and any store function could write it simply 
by virtue of having the right extension on the files it was reading or writing.
+ 
+ With Pig 0.7.0, the read/write functionality is taken over by Hadoop's 
Input/OutputFormat, and how compression is handled, or whether it is handled at 
all, depends on the Input/OutputFormat used by the loader/store function.
+ 
+ The main input format that supports compression is TextInputFormat. It 
supports bzip files with the .bz2 extension and gzip files with the .gz 
extension. '''Note that it does not support .bz files'''. PigStorage is the 
only loader that comes with Pig that is derived from TextInputFormat, which 
means it will be able to handle .bz2 and .gz files. Other loaders such as 
BinStorage will no longer support compression.
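
The read-side rules described above can be summarized in a small, 
self-contained Java sketch. It only encodes the statements made in this 
section and does not call into Pig or Hadoop; the class name, the Loader enum, 
and both method names are hypothetical and exist purely for illustration.

```java
import java.util.Locale;

public class CompressionSupport {

    /** Built-in loaders named in the text; this enum is a hypothetical stand-in. */
    public enum Loader { PIG_STORAGE, BIN_STORAGE }

    /** Returns the lower-cased text after the last dot, or "" if there is no dot. */
    public static String extension(String path) {
        int dot = path.lastIndexOf('.');
        return dot < 0 ? "" : path.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /**
     * 0.6.0 and earlier: compression was decoupled from the loader, so any
     * loader could read .bz, .bz2, and .gz files.
     */
    public static boolean readsCompressedIn06(Loader loader, String path) {
        String ext = extension(path);
        return ext.equals("bz") || ext.equals("bz2") || ext.equals("gz");
    }

    /**
     * 0.7.0: only the TextInputFormat-based loader (PigStorage) reads
     * compressed input, and only .bz2 and .gz; .bz is no longer supported.
     */
    public static boolean readsCompressedIn07(Loader loader, String path) {
        String ext = extension(path);
        return loader == Loader.PIG_STORAGE
                && (ext.equals("bz2") || ext.equals("gz"));
    }

    public static void main(String[] args) {
        // Compare the two versions for a few sample (hypothetical) file names.
        String[] paths = { "data.bz", "data.bz2", "data.gz" };
        for (Loader loader : Loader.values()) {
            for (String path : paths) {
                System.out.println(loader + " " + path
                        + " 0.6.0=" + readsCompressedIn06(loader, path)
                        + " 0.7.0=" + readsCompressedIn07(loader, path));
            }
        }
    }
}
```

A script that loads a .bz file, or that loads compressed data through 
BinStorage, is a case where the two methods disagree, which is the situation 
that needs attention when moving to 0.7.0.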
+ 
+ On the store side, TextOutputFormat also supports compression, but the store 
function needs to do additional work to enable it. Again, PigStorage will 
support compression while other functions will not.
+ 
+ If you have a custom load/store function that needs to support compression, 
you would need to make sure that the underlying Input/OutputFormat supports 
this type of compression.
+ 
  == Local Mode ==
  == Streaming ==
  == Other Changes ==
