Hi,

Has anyone seen the following?

I am getting an error when running ORDER:
   ERROR 1071: Cannot convert a Unknown to a String

The error occurs at DataType.java:885. At the end of that switch
statement, the variable 'type' is -1, and the variable 'o' is a string that
looks like a leftover from the prior statements A or B. The value of 'o' is:

%!PS-Adobe-2.0
%%Creator: dvips(k) 5.86 Copyright 1999 Radical Eye Software
%%Title: arXiv:astro-ph/0005123 v3   2 Oct 2000
%%Pages: 7
%%PageOrder: As...

Note that if I skip the ORDER statement, everything works and the
resulting file looks correct (in random order, of course).

The error does not occur if I make one simple change:
STORE D into a tmp file, then LOAD that file and run the ORDER
statement (E) unchanged.

Pseudocode below, followed by the stack trace.

A    = LOAD 'foo'
       USING aLoader()
       AS (url:chararray,
              date:chararray,
              pageSize:int,
              position:int,
              docidInCrawl:int,
              httpHeader:chararray,
              content:chararray);

B    = FOREACH A GENERATE
       udf();

-- B is of the form {(chararray,chararray,int), (chararray,chararray,int),
-- ... }

D = FOREACH B GENERATE flatten($0) AS (token:chararray, docID:chararray,
tokenPos:int);
E = ORDER D BY token ASC;
STORE E INTO 'bar';
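
For reference, the workaround described above looks roughly like the sketch below. The path 'tmp_D' and the alias D2 are placeholder names, not from my real script, and the sketch assumes the default PigStorage for both the STORE and the LOAD.

```pig
-- Workaround sketch: materialize D, reload it with an explicit schema,
-- then ORDER the reloaded relation instead of D itself.
-- 'tmp_D' and D2 are placeholder names (assumptions for illustration).
STORE D INTO 'tmp_D';

D2 = LOAD 'tmp_D'
     AS (token:chararray, docID:chararray, tokenPos:int);

-- Same ORDER as before, only the input relation differs.
E = ORDER D2 BY token ASC;
STORE E INTO 'bar';
```

If the STORE and the LOAD do not run in the right order within one script, the two halves can be run as two separate scripts instead.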


org.apache.pig.backend.executionengine.ExecException: ERROR 1071: Cannot convert a Unknown to a String
    at org.apache.pig.data.DataType.toString(DataType.java:885)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:642)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:367)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:276)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:240)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:276)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:240)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:236)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
