> On Oct. 10, 2013, 8:50 p.m., Julien Le Dem wrote:
> > trunk/src/org/apache/pig/impl/util/Utils.java, line 329
> > <https://reviews.apache.org/r/14552/diff/3/?file=362607#file362607line329>
> >
> > else throw exception?
> > we have a default if it's not set but we don't do anything if it's an
> > invalid compression codec
>
> Aniket Mokashi wrote:
> Idea is to let user provide a codec through
> "mapred.output.compression.codec" property. In case SequenceFile (Hadoop)
> starts supporting more than gz, snappy, lzo and bzip2 in newer version of
> hadoop. Note, when using seqfile, user can completely avoid setting
> pig.tmpfilecompression.codec and use mapred.output.compression.codec if
> wanted.
>
> We need this because TFileStorage uses pig's implementations of gz and
> lzo instead of hadoop. Also, gzip codec is called gz by pig and gzip by
> hadoop (making hard to be consistent and backwards compatible).
>
> Julien Le Dem wrote:
> sure. In particular my goal was to avoid doing nothing silently when
> there is a typo in the setting. Those are hard to debug. In the current impl
> not setting pig.tmpfilecompression.codec will force the DefaultCodec and
> setting to something we don't understand will leave the codec unchanged.
> hardly intuitive.
> In general setting a property to something unknown should not fall back
> to a default but fail with an explicit message.
Actually, thinking about it now, I think your comments make sense. I will
modify the code accordingly. Thanks a lot! :)
So, new behavior:
codec.equals("") -> user mean to set it through the mapred property.
else -> misspelling etc.
This is definitely more intuitive.
> On Oct. 10, 2013, 8:50 p.m., Julien Le Dem wrote:
> > trunk/src/org/apache/pig/impl/util/Utils.java, line 259
> > <https://reviews.apache.org/r/14552/diff/3/?file=362607#file362607line259>
> >
> > We should throw an exception if the trimmed lowercased string is
> > something we don't expect.
> > otherwise typos (seqFile, sequencefile, ...) will lead to unexpected
> > behavior.
> >
> > the call becomes then:
> > ...getProperty(...PIG_TEMP_FILE_COMPRESSION_STORAGE, "tfile");
> >
> > example message:
> > PigConfiguration.PIG_TEMP_FILE_COMPRESSION_STORAGE +": expected storage
> > seqfile or tfile (default tfile)"
> >
> >
>
> Aniket Mokashi wrote:
> We need seqfile as an option. tfile can be used by default without any
> extra properties flag.
>
> Julien Le Dem wrote:
> my point was that we should use the default when the property is not
> defined not when it is set to a value we don't understand. In that case it
> should fail.
got it. So, either property should not exist or property should be set to
tfile, (Instead of checking for "" string), correct?
Totally makes sense.
- Aniket
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14552/#review26894
-----------------------------------------------------------
On Oct. 9, 2013, 6 p.m., Aniket Mokashi wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/14552/
> -----------------------------------------------------------
>
> (Updated Oct. 9, 2013, 6 p.m.)
>
>
> Review request for pig, Cheolsoo Park, Dmitriy Ryaboy, Julien Le Dem, and
> Rohini Palaniswamy.
>
>
> Bugs: PIG-3480
> https://issues.apache.org/jira/browse/PIG-3480
>
>
> Repository: pig
>
>
> Description
> -------
>
> - Added a new parameter to make SequenceFileInterStorage optional.
> - Added tests
> - Refactored apis
>
>
> Diffs
> -----
>
> trunk/conf/pig.properties 1530468
> trunk/src/org/apache/pig/PigConfiguration.java 1530468
>
> trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java
> 1530468
>
> trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/partitioners/WeightedRangePartitioner.java
> 1530468
>
> trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPartitionRearrange.java
> 1530468
>
> trunk/src/org/apache/pig/backend/hadoop/executionengine/util/MapRedUtil.java
> 1530468
> trunk/src/org/apache/pig/impl/io/InterStorage.java 1530468
> trunk/src/org/apache/pig/impl/io/SequenceFileInterStorage.java PRE-CREATION
> trunk/src/org/apache/pig/impl/io/TFileStorage.java 1530468
> trunk/src/org/apache/pig/impl/util/Utils.java 1530468
> trunk/test/org/apache/pig/test/TestTmpFileCompression.java 1530468
>
> Diff: https://reviews.apache.org/r/14552/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Aniket Mokashi
>
>