Below is the code and the resulting physical plan. Judging from the plan, the query is the same as the one run from cqlsh:
val query = s"""select * from model_data where TimeStamp > '$timeStamp+0000' and TimeStamp <= '$startTS+0000' and MetricID = $metricID"""
println("Model query: " + query)

// DataFrame over the Cassandra table, registered as the view the SQL query reads
val df = spark.read
  .format(Config.dbDriver)
  .options(Map("keyspace" -> Config.dbName, "table" -> "ml_forecast_tbl"))
  .load()
df.createOrReplaceTempView("model_data")

// The same range expressed through the connector's RDD API, for comparison
val d = spark.sparkContext.cassandraTable("analytics_db", "ml_forecast_tbl")
  .where("\"TimeStamp\" > ? and \"TimeStamp\" <= ? and \"MetricID\" = 1",
         timeStamp + "+0000", startTS + "+0000")

val modelDF = spark.sql(query).cache()
println(spark.sql(query).queryExecution)

== Physical Plan ==
InMemoryTableScan [MetricID#9045, TimeStamp#9046, ResourceID#9047, Forecast#9048, GlobalThresholdMax#9049, GlobalThresholdMin#9050, Hi85#9051, Hi99#9052, Low85#9053, Low99#9054]
   +- InMemoryRelation [MetricID#9045, TimeStamp#9046, ResourceID#9047, Forecast#9048, GlobalThresholdMax#9049, GlobalThresholdMin#9050, Hi85#9051, Hi99#9052, Low85#9053, Low99#9054], true, 10000, StorageLevel(disk, memory, deserialized, 1 replicas)
         +- *Filter ((((cast(TimeStamp#9046 as string) > 2016-01-22 00:00:00+0000) && (cast(TimeStamp#9046 as string) <= 2016-01-22 00:30:00+0000)) && isnotnull(TimeStamp#9046)) && isnotnull(MetricID#9045))
            +- *Scan org.apache.spark.sql.cassandra.CassandraSourceRelation@40dc2ade [MetricID#9045,TimeStamp#9046,ResourceID#9047,Forecast#9048,GlobalThresholdMax#9049,GlobalThresholdMin#9050,Hi85#9051,Hi99#9052,Low85#9053,Low99#9054] PushedFilters: [IsNotNull(TimeStamp), IsNotNull(MetricID), EqualTo(MetricID,1)], ReadSchema: struct<MetricID:int,TimeStamp:timestamp,ResourceID:string,Forecast:double,GlobalThresholdMax:doub...

On Tue, Jun 20, 2017 at 5:13 PM, Riccardo Ferrari <ferra...@gmail.com> wrote:

> Hi,
>
> Personally, I would inspect how the dates are managed. What does your
> Spark code look like? What does the explain say? Does TimeStamp get
> parsed the same way?
>
> Best,
>
> On Tue, Jun 20, 2017 at 12:52 PM, sujeet jog <sujeet....@gmail.com> wrote:
>
>> Hello,
>>
>> I have a table as below:
>>
>> CREATE TABLE analytics_db.ml_forecast_tbl (
>>     "MetricID" int,
>>     "TimeStamp" timestamp,
>>     "ResourceID" timeuuid,
>>     "Value" double,
>>     PRIMARY KEY ("MetricID", "TimeStamp", "ResourceID")
>> )
>>
>> select * from ml_forecast_tbl
>> where "MetricID" = 1
>>   and "TimeStamp" > '2016-01-22 00:00:00+0000'
>>   and "TimeStamp" <= '2016-01-22 00:30:00+0000';
>>
>>  MetricID | TimeStamp                       | ResourceID                           | Value    | ...
>> ----------+---------------------------------+--------------------------------------+----------+---------------------
>>         1 | 2016-01-22 00:30:00.000000+0000 | 4a925190-3b13-11e7-83c6-a32261219d03 | 32.16177 | 23.74124 | 15.2371
>>         1 | 2016-01-22 00:30:00.000000+0000 | 4a92c6c0-3b13-11e7-83c6-a32261219d03 | 32.16177 | 23.74124 | 15.2371
>>         1 | 2016-01-22 00:30:00.000000+0000 | 4a936300-3b13-11e7-83c6-a32261219d03 | 32.16177 | 23.74124 | 15.2371
>>         1 | 2016-01-22 00:30:00.000000+0000 | 4a93d830-3b13-11e7-83c6-a32261219d03 | 32.16177 | 23.74124 | 15.2371
>>
>> This query runs perfectly fine from cqlsh, but not with Spark SQL; it
>> just returns empty results. Is there a catch to think about when querying
>> timestamp ranges with the Cassandra Spark connector?
>>
>> Any inputs on this?
>>
>> Thanks,
>> Sujeet
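
P.S. One thing that stands out in the plan above: the filter is evaluated as cast(TimeStamp#9046 as string) > 2016-01-22 00:00:00+0000, i.e. Spark compares the column as a string, and the range predicate never appears under PushedFilters (only the MetricID equality does). Below is a minimal sketch, untested, of the same read with the bounds kept as real timestamp values; the keyspace, table, and column names come from this thread, the variable names and everything else are assumptions:

import java.sql.Timestamp
import java.time.Instant
import org.apache.spark.sql.functions.{col, lit}

// "+0000" in the original strings means UTC, so build the bounds from Instants
val lower = Timestamp.from(Instant.parse("2016-01-22T00:00:00Z"))
val upper = Timestamp.from(Instant.parse("2016-01-22T00:30:00Z"))

val typedDF = spark.read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "analytics_db", "table" -> "ml_forecast_tbl"))
  .load()
  .filter(col("MetricID") === 1 &&
          col("TimeStamp") > lit(lower) &&
          col("TimeStamp") <= lit(upper))

typedDF.explain()  // the TimeStamp range should now show up under PushedFilters

With the operands typed as timestamps, the comparison stays in the timestamp domain instead of Spark casting the column to string row by row, which also gives the connector a chance to turn the range into a CQL restriction on the clustering column.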
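
The same idea applies to the RDD path: the connector's where() takes bind values, so actual Date objects can be passed instead of formatted strings. Again only a sketch, untested, mirroring the where() call from this thread:

import com.datastax.spark.connector._
import java.util.Date
import java.time.Instant

val lowerDate: Date = Date.from(Instant.parse("2016-01-22T00:00:00Z"))
val upperDate: Date = Date.from(Instant.parse("2016-01-22T00:30:00Z"))

// Bind real Date values for the timestamp markers rather than strings,
// so no string-to-timestamp parsing is involved on either side
val rows = spark.sparkContext
  .cassandraTable("analytics_db", "ml_forecast_tbl")
  .where("\"TimeStamp\" > ? and \"TimeStamp\" <= ? and \"MetricID\" = 1",
         lowerDate, upperDate)
println(rows.count())  // should match the row count seen from cqlsh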