Hi John,

I tried to follow your description but failed to reproduce this issue.
Would you mind providing some more details? In particular:


   Exact Git commit hash of the snapshot version you were using

   Mine: e0f946265b9ea5bc48849cf7794c2c03d5e29fba


   Compilation flags (Hadoop version, profiles enabled, etc.)


   ./sbt/sbt -Pyarn,kinesis-asl,hive,hadoop-2.3 -Dhadoop.version=2.3.0 clean assembly/assembly


   Also, it would be great if you could provide the schema of your table plus
   some sample data that can help reproduce this issue.
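
In case it helps, here is a minimal sketch of how the schema and a few sample rows could be collected from spark-shell. It assumes the table and column names from your message (table1, col1 through col4) and that `hql` is available on HiveContext in your build (newer builds deprecate it in favor of `sql`); adjust as needed.

```scala
// Run inside spark-shell, where `sc` is the SparkContext the shell provides.
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
import hiveContext._

// Print the table schema as Hive reports it.
// "table1" is the placeholder table name used in the original report.
hql("DESCRIBE table1").collect().foreach(println)

// Fetch a handful of sample rows for the columns used in the failing query.
// col1..col4 are the placeholder column names from the original report.
hql("SELECT col1, col2, col3, col4 FROM table1 LIMIT 5").collect().foreach(println)
```

Running `git rev-parse HEAD` in the Spark source tree prints the full commit hash of the checkout.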


On Wed, Aug 20, 2014 at 6:11 AM, John Omernik <j...@omernik.com> wrote:

> I am working with Spark SQL and the Thrift server.  I ran into an
> interesting bug, and I am curious about what information/testing I can
> provide to help narrow things down.
> My setup is as follows:
> Hive 0.12 with a table that has lots of columns (50+) stored as rcfile.
> Spark-1.1.0-SNAPSHOT with Hive built in (and Thrift Server)
> My query selects only one STRING column from the data, but filters the
> rows based on other columns.
> Types:
> col1 = STRING
> col2 = STRING
> col3 = STRING
> col4 = Partition Field (TYPE STRING)
> Queries
> cache table table1;
> --Run some other queries on other data
> select col1 from table1
> where col2 = 'foo' and col3 = 'bar' and col4 = 'foobar' and col1 is not
> null limit 100
> Fairly simple query.
> When I run this in SQL Squirrel I get no results. When I remove 'and
> col1 is not null' I get 100 rows of <null>
> When I run this in beeline (the one that is in the spark-1.1.0-SNAPSHOT) I
> get no results, and when I remove 'and col1 is not null' I get 100 rows of
> <null>
> Note: In both cases I ran CACHE TABLE TABLE1 first, before any queries,
> and then ran some other queries (i.e. on other columns) before this one.
> That seemed interesting to me...
> So I went to the spark-shell to determine if it was a spark issue, or a
> thrift issue.
> I ran:
> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
> import hiveContext._
> cacheTable("table1")
> Then I ran the same "other" queries and got results, and then I ran the
> query above, and I got results as expected.
> Interestingly enough, if I don't cache the table through 'cache table
> table1' in Thrift, I get results for all queries. If I uncache, I start
> getting results again.
> I hope I was clear enough here. I am happy to help however I can.
> John
