Josh Rosen created SPARK-8868:
---------------------------------

             Summary: SqlSerializer2 can go into infinite loop when row 
consists only of NullType columns
                 Key: SPARK-8868
                 URL: https://issues.apache.org/jira/browse/SPARK-8868
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.4.0, 1.5.0
            Reporter: Josh Rosen
            Priority: Minor


The following SQL query will cause an infinite loop in SqlSerializer2 code:

{code}
val df = sqlContext.sql("select null where 1 = 1")
df.unionAll(df).sort("_c0").collect()
{code}

The same problem occurs if we add more null literals, but it does not occur as 
long as there is a column of any other type
(e.g. {{select 1, null where 1 == 1}}).

I think what's happening here is this: if a row consists only of NullType 
columns (that is, columns of null literals, not columns of other types which 
happen to contain only null values), SqlSerializer2 ends up writing and reading 
no data for rows of this type.  The deserialization stream never tries to read 
any data, yet it can still return an empty row, so 
DeserializationStream.asIterator goes into an infinite loop: no read ever 
occurs that could trigger an EOF exception.
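
The suspected mechanism can be sketched outside of Spark. The snippet below is 
only an illustration under stated assumptions, not Spark's actual code: 
{{EmptyRowLoopSketch}} and {{rows}} are hypothetical names, and {{rows}} is a 
simplified stand-in for an iterator that keeps reading rows until a read throws 
an EOFException. A row reader that consumes no bytes never reaches EOF, so the 
iterator has to be capped with {{take}} to terminate.

```scala
import java.io.{ByteArrayInputStream, DataInputStream, EOFException}

object EmptyRowLoopSketch {
  // Hypothetical, simplified stand-in for an "iterate until EOF" pattern:
  // keep calling readRow until one of its reads throws EOFException.
  def rows[T](in: DataInputStream)(readRow: DataInputStream => T): Iterator[T] =
    new Iterator[T] {
      private var buffered: Option[T] = None
      private var finished = false

      def hasNext: Boolean = {
        if (buffered.isEmpty && !finished) {
          try {
            buffered = Some(readRow(in))
          } catch {
            // The only way this iterator ever stops.
            case _: EOFException => finished = true
          }
        }
        buffered.isDefined
      }

      def next(): T = {
        if (!hasNext) throw new NoSuchElementException("end of stream")
        val row = buffered.get
        buffered = None
        row
      }
    }

  private def emptyStream(): DataInputStream =
    new DataInputStream(new ByteArrayInputStream(Array.empty[Byte]))

  def main(args: Array[String]): Unit = {
    // A row with one int column: the first read hits EOF, so iteration ends.
    val intRows = rows(emptyStream())(_.readInt())
    println(intRows.size) // prints 0

    // A row with only null-literal columns: nothing is read, so EOF is never
    // reached. Without take(5) this iterator would never terminate.
    val nullRows = rows(emptyStream())(_ => Seq[Any](null))
    println(nullRows.take(5).size) // prints 5
  }
}
```

In this sketch the stream is empty in both cases; the only difference is 
whether producing a row touches the stream at all.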


