It makes sense to update the documentation, since these issues were solved by using the following properties in the connection URL:

defaultRowFetchSize=10000 [1]
defaultAutoCommit=false [2]

The full URL was:

"url": "jdbc:postgresql://myhost.mydomain.com/mydb?useCursorFetch=true&defaultAutoCommit=false&loggerLevel=TRACE&loggerFile=/tmp/jdbc.log&defaultRowFetchSize=10000"
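For context, these properties belong in the `url` field of Drill's RDBMS (jdbc) storage plugin configuration. A minimal sketch, following the format in the Drill docs; the host, database, and credentials here are placeholders:

```json
{
  "type": "jdbc",
  "driver": "org.postgresql.Driver",
  "url": "jdbc:postgresql://myhost.mydomain.com/mydb?defaultAutoCommit=false&defaultRowFetchSize=10000",
  "username": "drill_user",
  "password": "secret",
  "enabled": true
}
```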
If some issues are still present, it is also reasonable to create tickets to track them.

[1]
https://mail-archives.apache.org/mod_mbox/drill-user/201808.mbox/%3CCADN0Fn9066hwvu_ZyDJ24tkAoJH5hqXoysCv83z7DdSSfjr-CQ%40mail.gmail.com%3E
[2]
https://mail-archives.apache.org/mod_mbox/drill-user/201808.mbox/%3C0d36e0e6e8dc1e77bbb67bbfde5f5296e290c075.camel%40omnicell.com%3E

On Sat, Oct 13, 2018 at 3:56 PM Rahul Raj <[email protected]> wrote:

> Should I create tickets to track these issues, or should I create a ticket
> to update the documentation?
>
> Rahul
>
> On Sat, Oct 13, 2018 at 6:16 PM Vitalii Diravka <[email protected]>
> wrote:
>
> > 1. You are right; that means it is reasonable to extend this rule to
> > apply to other Scan operators (or possibly to create a separate one).
> > 2. There was a question about OOM issues in Drill + PostgreSQL, please
> > take a look [1].
> > Since you are setting up these configs, it would be good if you created
> > a Jira ticket to add this info to the Drill docs [2].
> >
> > [1]
> > https://mail-archives.apache.org/mod_mbox/drill-user/201808.mbox/browser
> > [2] https://drill.apache.org/docs/rdbms-storage-plugin/
> >
> > On Sat, Oct 13, 2018 at 2:21 PM Rahul Raj <[email protected]> wrote:
> >
> > > Regarding the heap out-of-memory error, it could be that the JDBC
> > > driver is prefetching the entire result set into memory. I just had a
> > > look at JdbcRecordReader; it looks like by setting
> > > connection#setAutoCommit(false) and a sufficient fetch size we could
> > > force the driver to stream data as required. This is how the Postgres
> > > driver works:
> > >
> > > https://jdbc.postgresql.org/documentation/head/query.html#query-with-cursor
> > >
> > > We will have to see the behaviour of other drivers too.
> > >
> > > Let me know your thoughts here.
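The cursor-based fetch setup described above can be sketched as follows. This is a minimal illustration, not Drill's actual JdbcRecordReader code, and the class and method names are hypothetical. Per the PostgreSQL JDBC documentation, the driver only switches to cursor-based fetching when autocommit is off, the statement is forward-only (the default), and a non-zero fetch size is set:

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper (not part of Drill): configures a connection so the
// PostgreSQL driver streams rows with a server-side cursor instead of
// buffering the entire result set in the Java heap.
public class StreamingFetch {
    public static Statement configureForStreaming(Connection con, int fetchSize)
            throws SQLException {
        con.setAutoCommit(false);             // cursor mode requires autocommit off
        Statement st = con.createStatement(); // default is TYPE_FORWARD_ONLY
        st.setFetchSize(fetchSize);           // rows fetched per round trip, e.g. 10000
        return st;
    }
}
```

With this in place, `st.executeQuery(...)` pulls `fetchSize` rows at a time, which matches what the `defaultAutoCommit=false` and `defaultRowFetchSize` URL properties enable declaratively.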
> > >
> > > Regards,
> > > Rahul
> > >
> > >
> > > On Sat, Oct 13, 2018 at 3:47 PM Rahul Raj <[email protected]>
> > > wrote:
> > >
> > > > Hi Vitalii,
> > > >
> > > > There are two concrete implementations of the class:
> > > > DrillPushLimitToScanRule LIMIT_ON_SCAN and
> > > > DrillPushLimitToScanRule LIMIT_ON_PROJECT.
> > > > LIMIT_ON_SCAN has a comment mentioning "For now only applies to
> > > > Parquet. And pushdown only apply limit but not offset".
> > > >
> > > > Also, I enabled debug mode and found that the LIMIT is not getting
> > > > pushed into the query:
> > > >
> > > > LimitPrel(fetch=[11]): rowcount = 11.0, cumulative cost = {83.0 rows, 226.0 cpu, 0.0 io, 585728.0 network, 0.0 memory}, id = 261
> > > >   UnionExchangePrel: rowcount = 11.0, cumulative cost = {72.0 rows, 182.0 cpu, 0.0 io, 585728.0 network, 0.0 memory}, id = 259
> > > >     LimitPrel(fetch=[11]): rowcount = 11.0, cumulative cost = {61.0 rows, 94.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 257
> > > >       JdbcPrel(sql=[SELECT * FROM "u_g001"."executioniteration" WHERE "id" > 360500000 ]): rowcount = 50.0, cumulative cost = {50.0 rows, 50.0 cpu
> > > >
> > > > Regarding the second point, it's the Java heap getting filled with
> > > > JDBC results. How do we address this?
> > > >
> > > > Regards,
> > > > Rahul
> > > >
> > > > On Fri, Oct 12, 2018 at 8:11 PM Vitalii Diravka <[email protected]>
> > > > wrote:
> > > >
> > > >> Hi Rahul,
> > > >>
> > > >> Drill has the *DrillPushLimitToScanRule* [1] rule, which should do
> > > >> this optimization if the GroupScan supports limit push-down.
> > > >> You can also verify in debug mode whether this rule is fired.
> > > >> Possibly for some external DBs (like MapR-DB) Drill should have a
> > > >> separate class for this optimization [2].
> > > >>
> > > >> [1]
> > > >> https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushLimitToScanRule.java#L28
> > > >> [2]
> > > >> https://github.com/apache/drill/pull/1466/files#diff-4819b70118487d81bc9c46a04b0eaaa3R37
> > > >>
> > > >> On Fri, Oct 12, 2018 at 3:19 PM Rahul Raj <[email protected]>
> > > >> wrote:
> > > >>
> > > >> > Hi,
> > > >> >
> > > >> > Drill does not push LIMIT queries down to external databases, and
> > > >> > I assume it could be more related to Calcite. This leads to
> > > >> > out-of-memory situations while querying a large table to view a
> > > >> > few records. Is there something that could be improved here? One
> > > >> > solution would be to push filters down to the DB and/or combine
> > > >> > this with some JDBC batch size limit to flush a part as Parquet.
> > > >> >
> > > >> > Regards,
> > > >> > Rahul
> > > >> >
> > > >> > --
> > > >> > _*** This email and any files transmitted with it are confidential
> > > >> > and intended solely for the use of the individual or entity to
> > > >> > whom it is addressed. If you are not the named addressee then you
> > > >> > should not disseminate, distribute or copy this e-mail. Please
> > > >> > notify the sender immediately and delete this e-mail from your
> > > >> > system.***_
> > > >> >
> > > >>
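To make the push-down discussion in this thread concrete: the goal is for the LIMIT to reach the SQL that JdbcPrel sends to the external database, so the DB returns only a handful of rows instead of the whole table. A toy illustration as plain string rewriting; Drill's real DrillPushLimitToScanRule operates on Calcite plan nodes, not SQL text:

```java
// Toy illustration only -- not Drill's planner rule. It just shows the
// intended effect of limit push-down: the external database receives the
// LIMIT and returns at most that many rows.
public class LimitPushdownDemo {
    // Append a LIMIT clause to the generated JDBC SQL if none is present.
    static String pushLimit(String jdbcSql, int fetch) {
        return jdbcSql.toUpperCase().contains("LIMIT")
                ? jdbcSql
                : jdbcSql + " LIMIT " + fetch;
    }

    public static void main(String[] args) {
        String sql = "SELECT * FROM \"u_g001\".\"executioniteration\" WHERE \"id\" > 360500000";
        // With the limit pushed down, the DB scans and ships at most 11 rows
        // instead of streaming the whole table into Drill's heap.
        System.out.println(pushLimit(sql, 11));
    }
}
```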
