[ 
https://issues.apache.org/jira/browse/PIG-4943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sarath updated PIG-4943:
------------------------
    Description: 
I have a script which stores 2 relations with different schema using 
CSVExcelStorage.

The issue which i see is that the script picks up the last store function and 
takes the schema in that and puts it for all store functions , overriding the 
previous store schemas.Is this a known issue and is there a fix for this ?

My Sample Script Looks like this :--

=============================================================

masterInput = load 'hbase://xyz' using 
org.apache.pig.backend.hadoop.hbase.HBaseStorage(
                    'f:a,f:b,f:c,f:d')
          as (a,b,c,d);

input2 = foreach masterInput
                  generate
                        a,b;

input3 = foreach masterInput
                  generate
                      c,d;

store input2 into '/dir/ab'
using org.apache.pig.piggybank.storage.CSVExcelStorage('\t','YES_MULTILINE', 
'UNIX', 'WRITE_OUTPUT_HEADER');

store input3 into '/dir/cd'
using org.apache.pig.piggybank.storage.CSVExcelStorage('\t','YES_MULTILINE', 
'UNIX', 'WRITE_OUTPUT_HEADER');

=============================================================
Where *a,b,c,d* are my headers in my source file

Expected                   Output :

||file 1||
|a|b
|10|20

||file 2||
|c|d
|30|40


Actual Output :

||file 1||
|c|d
|10|20

||file 2||
|c|d
|30|40

  was:
I have a script which stores 2 relations with different schema using 
CSVExcelStorage.

The issue which i see is that the script picks up the last store function and 
takes the schema in that and puts it for all store functions , overriding the 
previous store schemas.Is this a known issue and is there a fix for this ?

My Sample Script Looks like this :--

=============================================================

masterInput = load 'hbase://xyz' using 
org.apache.pig.backend.hadoop.hbase.HBaseStorage(
                    'f:a,f:b,f:c,f:d')
          as (a,b,c,d);

input2 = foreach masterInput
                  generate
                        a,b;

input3 = foreach masterInput
                  generate
                      c,d;

store input2 into '/dir/ab'
using org.apache.pig.piggybank.storage.CSVExcelStorage('\t','YES_MULTILINE', 
'UNIX', 'WRITE_OUTPUT_HEADER');

store input3 into '/dir/cd'
using org.apache.pig.piggybank.storage.CSVExcelStorage('\t','YES_MULTILINE', 
'UNIX', 'WRITE_OUTPUT_HEADER');

=============================================================

Expected                   Output :

||file 1||
|a|b
|10|20

||file 2||
|c|d
|30|40


Actual Output :

||file 1||
|c|d
|10|20

||file 2||
|c|d
|30|40


> Schema issue while storing multiple pig outputs using CSVExcelStorage
> ---------------------------------------------------------------------
>
>                 Key: PIG-4943
>                 URL: https://issues.apache.org/jira/browse/PIG-4943
>             Project: Pig
>          Issue Type: Bug
>          Components: piggybank
>    Affects Versions: 0.14.0
>            Reporter: sarath
>            Priority: Minor
>
> I have a script which stores 2 relations with different schema using 
> CSVExcelStorage.
> The issue which i see is that the script picks up the last store function and 
> takes the schema in that and puts it for all store functions , overriding the 
> previous store schemas.Is this a known issue and is there a fix for this ?
> My Sample Script Looks like this :--
> =============================================================
> masterInput = load 'hbase://xyz' using 
> org.apache.pig.backend.hadoop.hbase.HBaseStorage(
>                     'f:a,f:b,f:c,f:d')
>           as (a,b,c,d);
> input2 = foreach masterInput
>                   generate
>                         a,b;
> input3 = foreach masterInput
>                   generate
>                       c,d;
> store input2 into '/dir/ab'
> using org.apache.pig.piggybank.storage.CSVExcelStorage('\t','YES_MULTILINE', 
> 'UNIX', 'WRITE_OUTPUT_HEADER');
> store input3 into '/dir/cd'
> using org.apache.pig.piggybank.storage.CSVExcelStorage('\t','YES_MULTILINE', 
> 'UNIX', 'WRITE_OUTPUT_HEADER');
> =============================================================
> Where *a,b,c,d* are my headers in my source file
> Expected                   Output :
> ||file 1||
> |a|b
> |10|20
> ||file 2||
> |c|d
> |30|40
> Actual Output :
> ||file 1||
> |c|d
> |10|20
> ||file 2||
> |c|d
> |30|40



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to