Hi Anil,

Thanks a lot for your response. It looks like I am indeed looking for incremental queries. So if I have a thread that polls every second to get the latest updates, I just have to change the partition values to minimize the scans, right?
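
A minimal sketch of that polling idea in Python, under assumptions that are not from this thread: the table path `/data/events` and the hourly directory naming `yyyy-MM-dd-HH` are hypothetical, and the filter relies on Drill's implicit `dir0` column for the first directory level. The checkpoint is kept by the caller, since Drill does not manage it. The sketch only builds the query text; submitting it to Drill is left out.

```python
from datetime import datetime, timedelta

# Hypothetical table path and partition naming; adjust to the real layout.
TABLE = "dfs.`/data/events`"
HOUR_FORMAT = "%Y-%m-%d-%H"


def hours_between(start, end):
    """Return every hourly partition name from start to end, inclusive."""
    current = start.replace(minute=0, second=0, microsecond=0)
    last = end.replace(minute=0, second=0, microsecond=0)
    names = []
    while current <= last:
        names.append(current.strftime(HOUR_FORMAT))
        current += timedelta(hours=1)
    return names


def build_poll_query(checkpoint, now):
    """Build a query limited to partitions written since the checkpoint."""
    names = hours_between(checkpoint, now)
    partitions = ", ".join("'%s'" % name for name in names)
    # dir0 is the implicit column Drill exposes for the first directory level.
    return "SELECT * FROM %s WHERE dir0 IN (%s)" % (TABLE, partitions)
```

Each poll would call `build_poll_query(last_checkpoint, datetime.now())`, run the result, and then advance `last_checkpoint`. Older partitions that receive late updates are not covered by this filter and would need the separate notification mechanism.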
Also, I guess I can build some notification mechanism in case my older partitions get an update.

Thanks!

On Thu, Nov 9, 2017 at 11:58 AM, AnilKumar B <[email protected]> wrote:

> Hi Kant,
>
> If I understand your questions properly, you are looking for incremental
> queries.
>
> Drill supports predicate pushdown with most data sources. In your case,
> suppose you are generating hourly partitions in HDFS using a Spark
> application. Then Drill is optimized to scan only specific partitions
> based on the query predicates (by using partition pruning); see for
> example https://issues.apache.org/jira/browse/DRILL-3121.
>
> But Drill will not manage any checkpointing. So if BI/dashboard tools
> like Tableau can support this checkpointing, then it's possible to
> connect with Drill incrementally.
>
> Coming to the latest Kafka storage plugin: in the first version we are
> targeting batch support. I mean, at query time it will fetch all the
> messages from the start to the end offsets for each topic partition and
> process the data. Currently it supports JSON, and in the next version we
> are targeting Avro support with schema registry. We are also discussing
> the feasibility of specifying start and end offset ranges, so that we
> can achieve incremental support by managing checkpointing externally.
>
> Thanks,
> B Anil Kumar.
>
> On Thu, Nov 9, 2017 at 11:14 AM, kant kodali <[email protected]> wrote:
>
> > Can someone elaborate on what happens underneath if I poll every
> > second (specifically related to the questions in my previous email)?
> >
> > Thanks!
> >
> > On Thu, Nov 9, 2017 at 7:56 AM, Ted Dunning <[email protected]>
> > wrote:
> >
> > > Confluent has a non-Apache product, I think, for streaming SQL.
> > >
> > > On Thu, Nov 9, 2017 at 4:50 PM, Saurabh Mahapatra <[email protected]>
> > > wrote:
> > >
> > > > Isn't there the new Kafka plugin? What exactly does that do?
> > > >
> > > > Best,
> > > > Saurabh
> > > >
> > > > Sent from my iPhone
> > > >
> > > > > On Nov 9, 2017, at 5:15 AM, kant kodali <[email protected]>
> > > > > wrote:
> > > > >
> > > > > Hi Tug,
> > > > >
> > > > > It's Parquet data on HDFS, and the data is constantly written to
> > > > > HDFS by Spark while consuming from Kafka.
> > > > >
> > > > > Is polling a common technique for, say, a real-time analytics
> > > > > dashboard? More importantly, if I poll, does Drill do the scan
> > > > > every time? If the answer is no, how does it know which data is
> > > > > new, since the data is written to HDFS constantly as a stream?
> > > > > (The query can be the same; however, the new data will be
> > > > > appended or updated in HDFS in Parquet format as a stream.)
> > > > >
> > > > > Thanks!
> > > > >
> > > > >> On Thu, Nov 9, 2017 at 4:47 AM, Tugdual Grall <[email protected]>
> > > > >> wrote:
> > > > >>
> > > > >> Hello,
> > > > >>
> > > > >> Today Drill cannot do continuous/streaming queries, so as you
> > > > >> mentioned you will have to use a polling technique.
> > > > >>
> > > > >> Just out of curiosity, which data source are you planning to
> > > > >> use?
> > > > >>
> > > > >> Regards
> > > > >> Tug
> > > > >>
> > > > >>> On Thu 9 Nov 2017 at 04:31, kant kodali <[email protected]>
> > > > >>> wrote:
> > > > >>>
> > > > >>> Hi All,
> > > > >>>
> > > > >>> I am new to Apache Drill. I am wondering if Apache Drill can
> > > > >>> perform streaming queries? For example, I have a constant
> > > > >>> stream of data in a 24-hour period and I would like to get
> > > > >>> updates as soon as I receive them.
> > > > >>>
> > > > >>> Do I need to have a polling thread that issues a Drill query
> > > > >>> every second?
> > > > >>>
> > > > >>> Thanks!
