Actually, I figured out the issue. There were fields with null values in my JSON, and those fields were being serialized as org.json.JSONObject.NULL sentinel objects, which Pig could not map to any valid type.
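For anyone hitting the same thing: the usual workaround is to replace the sentinel with a real Java null before handing map values to Pig. Below is a minimal, self-contained sketch of that pattern. Note the `JSON_NULL` object here is a stand-in for `org.json.JSONObject.NULL` (so the example compiles without the org.json jar), and `NullScrubber`/`scrub` are hypothetical names, not part of any loader.

```java
import java.util.HashMap;
import java.util.Map;

public class NullScrubber {
    // Stand-in for org.json.JSONObject.NULL: a distinct sentinel object
    // that represents a JSON null but is NOT a Java null.
    static final Object JSON_NULL = new Object();

    // Replace the sentinel with a real Java null so a downstream consumer
    // (e.g. Pig's map type) sees a value it can map to a known type.
    static Map<String, Object> scrub(Map<String, Object> raw) {
        Map<String, Object> clean = new HashMap<>();
        for (Map.Entry<String, Object> e : raw.entrySet()) {
            Object v = e.getValue();
            clean.put(e.getKey(), v == JSON_NULL ? null : v);
        }
        return clean;
    }

    public static void main(String[] args) {
        Map<String, Object> raw = new HashMap<>();
        raw.put("item", "book");
        raw.put("price", JSON_NULL); // a JSON null field
        Map<String, Object> clean = scrub(raw);
        System.out.println(clean.get("item"));          // book
        System.out.println(clean.get("price") == null); // true
    }
}
```

A custom LoadFunc (or a preprocessing pass over the JSON) applying this substitution keeps the keys present while giving Pig an actual null instead of an opaque object.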
-Rakesh

> From: [email protected]
> To: [email protected]
> Subject: RE: Unexpected data type -1 found in stream.
> Date: Thu, 21 Oct 2010 11:27:24 -0700
>
> I am using Pig 0.7. No luck even after removing the explicit cast.
>
> Pig is not able to determine the type of the elements of the map and is failing.
> I am able to DUMP A and B in isolation. It's the union that's not working.
>
> DESCRIBE U results in:
>
> {x: map[ ],item: chararray}
>
> -Rakesh
>
> > From: [email protected]
> > To: [email protected]; [email protected]
> > Date: Thu, 21 Oct 2010 14:19:36 +0530
> > Subject: Re: Unexpected data type -1 found in stream.
> >
> > Hi Rakesh,
> >
> > There was a known issue with explicit casts not working when the data is a
> > complex type (e.g. bags). Check PIG-616. It is marked resolved now.
> > As a confirmatory step, can you try removing the explicit cast to chararray
> > and check?
> >
> > Thanks & Regards,
> > /Rekha.
> >
> > On 10/21/10 11:58 AM, "rakesh kothari" <[email protected]> wrote:
> >
> > > My PIG script is roughly like this:
> > >
> > > A = LOAD input1 USING JsonLoader AS (x:map[]);
> > > B = LOAD input2 USING JsonLoader AS (x:map[]);
> > >
> > > A = FOREACH A GENERATE x, (chararray) x#'item' AS item:chararray;
> > > B = FOREACH B GENERATE x, (chararray) x#'item' AS item:chararray;
> > >
> > > U = UNION A, B;
> > >
> > > DUMP U;
> > >
> > > This leads to the following exception:
> > >
> > > java.lang.RuntimeException: Unexpected data type -1 found in stream.
> > >     at org.apache.pig.data.DataReaderWriter.writeDatum(DataReaderWriter.java:306)
> > >     at org.apache.pig.data.DataReaderWriter.writeDatum(DataReaderWriter.java:220)
> > >     at org.apache.pig.data.DefaultTuple.write(DefaultTuple.java:269)
> > >     at org.apache.pig.impl.io.BinStorageRecordWriter.write(BinStorageRecordWriter.java:69)
> > >     at org.apache.pig.builtin.BinStorage.putNext(BinStorage.java:102)
> > >     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:138)
> > >     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
> > >     at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:498)
> > >     at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
> > >     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
> > >     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:234)
> > >     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:227)
> > >     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:52)
> > >     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> > >     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
> > >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> > >     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> > >
> > > Any ideas?
> > >
> > > I am able to dump A and B.
> > >
> > > -Rakesh
