[
https://issues.apache.org/jira/browse/PIG-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
liyunzhang_intel updated PIG-4239:
----------------------------------
Status: Patch Available (was: Open)
Added the following code to StoreConverter.java in PIG-4239.patch:
{code}
if ("true".equalsIgnoreCase(storeJobConf
        .get(PigConfiguration.PIG_OUTPUT_LAZY))) {
    // Wrap PigOutputFormat in LazyOutputFormat so that a part file is
    // only created once a record is actually written.
    Job storeJob = new Job(storeJobConf);
    LazyOutputFormat.setOutputFormatClass(storeJob,
            PigOutputFormat.class);
    storeJobConf = (JobConf) storeJob.getConfiguration();
    storeJobConf.setOutputKeyClass(Text.class);
    storeJobConf.setOutputValueClass(Tuple.class);
    storeJobConf.set("mapred.output.dir", poStore.getSFile()
            .getFileName());
    pairRDDFunctions.saveAsNewAPIHadoopDataset(storeJobConf);
}
{code}
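For context, Hadoop's LazyOutputFormat defers creating the underlying record writer until the first record is written, which is why a partition with no records leaves no part file behind. The sketch below illustrates that idea in isolation using only java.nio; LazyWriter is a hypothetical name for illustration and is not part of the patch, Pig, Spark, or Hadoop.

```java
import java.io.BufferedWriter;
import java.io.Closeable;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only; not a Pig, Spark, or Hadoop class.
class LazyWriter implements Closeable {
    private final Path path;
    private BufferedWriter out; // stays null until the first record arrives

    LazyWriter(Path path) {
        this.path = path;
    }

    void write(String record) throws IOException {
        if (out == null) {
            // The output file is created here, on the first write only.
            out = Files.newBufferedWriter(path);
        }
        out.write(record);
        out.newLine();
    }

    @Override
    public void close() throws IOException {
        // Nothing was opened if no record was ever written,
        // so no empty file is left on disk.
        if (out != null) {
            out.close();
        }
    }
}
```

A writer that is opened and closed without any write() call leaves no file, which mirrors the behaviour expected for empty partitions when "pig.output.lazy" is "true".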
> "pig.output.lazy" not works in spark mode
> -----------------------------------------
>
> Key: PIG-4239
> URL: https://issues.apache.org/jira/browse/PIG-4239
> Project: Pig
> Issue Type: Bug
> Components: spark
> Reporter: liyunzhang_intel
> Assignee: liyunzhang_intel
> Attachments: PIG-4239.patch, lazy, lazy.pig
>
>
> If "pig.output.lazy" is "true", empty part files are expected to be omitted
> from the output.
> Steps to reproduce:
> 1. Set "pig.output.lazy" to "true" in $PIG_HOME/conf/pig.properties.
> 2. Run the following lazy.pig script in spark mode:
> cat lazy.pig
> a = load '/user/pig/lazy' using PigStorage();
> b = filter a by $0 == 'hey';
> c = store b into '/tmp/lazy.out';
> (lazy.pig and lazy are attached.)
> 3. The empty file "/tmp/lazy.out/part-rxxxx" is still generated; it should
> not be generated when "pig.output.lazy" is "true".