[jira] [Commented] (PHOENIX-4266) Avoid scanner caching in Phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17224197#comment-17224197 ]

Lars Hofhansl commented on PHOENIX-4266:

Just coming around to this again. The only time setCaching is beneficial is when we know ahead of time that we're not going to read all of the returned data. The only case that comes to mind is a LIMIT clause.

> Avoid scanner caching in Phoenix
>
> Key: PHOENIX-4266
> URL: https://issues.apache.org/jira/browse/PHOENIX-4266
> Project: Phoenix
> Issue Type: Bug
> Reporter: Lars Hofhansl
> Priority: Major
> Fix For: 5.1.0, 4.16.1, 4.17.0
>
> Phoenix tries to set caching on all scans. On HBase versions before 0.98 that
> made sense; now it is the wrong thing to do.
> HBase will by default do size-based chunking. Setting scanner caching
> prevents HBase from doing this work.
> We should avoid scanner caching everywhere, and only use it in cases where we
> know the number of rows to be returned (and that number is small).
> [~sergey.soldatov], [~jamestaylor]

--
This message was sent by Atlassian Jira (v8.3.4#803005)
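For illustration only, the rule suggested in the comment above (set a caching value only when a small LIMIT is known, otherwise leave it unset) could be sketched as below. This is a minimal sketch, not Phoenix code: the class name, the method name and the threshold of 1000 rows are all hypothetical, and nothing here calls HBase. A real implementation would presumably pass the returned value to Scan#setCaching() and skip the call when the result is empty.

```java
import java.util.OptionalInt;

/** Hypothetical policy sketch; not part of Phoenix or HBase. */
public class CachingPolicySketch {

    // Assumed cut-off for what counts as a "small" LIMIT. Purely illustrative.
    static final int SMALL_LIMIT_THRESHOLD = 1000;

    /**
     * Returns a row-count caching value only when the query has a known,
     * small LIMIT. Otherwise returns empty, meaning "do not set caching and
     * let the default size-based chunking decide".
     */
    public static OptionalInt cachingFor(OptionalInt limit) {
        if (limit.isPresent()
                && limit.getAsInt() > 0
                && limit.getAsInt() <= SMALL_LIMIT_THRESHOLD) {
            return OptionalInt.of(limit.getAsInt());
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        // LIMIT 10: fetch just the rows that will actually be read.
        System.out.println(cachingFor(OptionalInt.of(10)));
        // No LIMIT: leave caching unset.
        System.out.println(cachingFor(OptionalInt.empty()));
        // Very large LIMIT: treated like an unbounded scan.
        System.out.println(cachingFor(OptionalInt.of(1_000_000)));
    }
}
```

Returning an empty value rather than a large number keeps the "no caching configured" case explicit, so the caller can simply omit the setter.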
[jira] [Commented] (PHOENIX-4266) Avoid scanner caching in Phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015575#comment-17015575 ]

Andrew Kyle Purtell commented on PHOENIX-4266:

Do you mean setCaching() or setBatch()? Would a patch that simply removes all calls to Scan#setBatch() and Scan#setCaching() be an acceptable first cut? [~larsh]

> Avoid scanner caching in Phoenix
>
> Key: PHOENIX-4266
> URL: https://issues.apache.org/jira/browse/PHOENIX-4266
> Project: Phoenix
> Issue Type: Bug
> Reporter: Lars Hofhansl
> Priority: Major
> Fix For: 5.1.0, 4.16.0
[jira] [Commented] (PHOENIX-4266) Avoid scanner caching in Phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17011484#comment-17011484 ]

Lars Hofhansl commented on PHOENIX-4266:

The RoundRobinResultIterator uses the scanner caching value to schedule the round-robin scanning, so we'd have to invent another value there. The "roundrobinness" is independent of the caching anyway.

> Avoid scanner caching in Phoenix
>
> Key: PHOENIX-4266
> URL: https://issues.apache.org/jira/browse/PHOENIX-4266
> Project: Phoenix
> Issue Type: Bug
> Reporter: Lars Hofhansl
> Priority: Major
> Fix For: 5.1.0, 4.16.0
[jira] [Commented] (PHOENIX-4266) Avoid scanner caching in Phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17011462#comment-17011462 ]

Lars Hofhansl commented on PHOENIX-4266:

As part of this I can also do PHOENIX-5669.

> Avoid scanner caching in Phoenix
>
> Key: PHOENIX-4266
> URL: https://issues.apache.org/jira/browse/PHOENIX-4266
> Project: Phoenix
> Issue Type: Bug
> Reporter: Lars Hofhansl
> Priority: Major
> Fix For: 5.1.0, 4.16.0
[jira] [Commented] (PHOENIX-4266) Avoid scanner caching in Phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17011460#comment-17011460 ]

Lars Hofhansl commented on PHOENIX-4266:

We can either set the default caching to Integer.MAX_VALUE (that's what HBase does now by default), or simply remove the default from Phoenix completely. In any case, after more than two years, it is time to do this.

> Avoid scanner caching in Phoenix
>
> Key: PHOENIX-4266
> URL: https://issues.apache.org/jira/browse/PHOENIX-4266
> Project: Phoenix
> Issue Type: Bug
> Reporter: Lars Hofhansl
> Priority: Major
> Fix For: 5.1.0, 4.16.0
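To make the effect of the Integer.MAX_VALUE option concrete, here is a toy model of how one scanner response might be filled when both a row cap (the caching value) and a byte limit apply. It is a simplified sketch under stated assumptions, not HBase code: the class and method names, the 100-byte rows and the 20,000-byte limit are all hypothetical, and the real server-side logic is more involved. The only point it illustrates is that a small row cap closes each chunk long before the size limit is reached, while an effectively unlimited cap lets size-based chunking decide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of filling scanner responses; not HBase code. */
public class ScanChunkingSketch {

    /**
     * Splits rows (given by their sizes in bytes) into chunks. A chunk is
     * closed as soon as it holds rowCap rows or its accumulated size reaches
     * maxBytes, whichever happens first. Returns the row count of each chunk.
     */
    public static List<Integer> chunkRowCounts(long[] rowSizes, int rowCap, long maxBytes) {
        List<Integer> counts = new ArrayList<>();
        int rowsInChunk = 0;
        long bytesInChunk = 0;
        for (long size : rowSizes) {
            rowsInChunk++;
            bytesInChunk += size;
            if (rowsInChunk >= rowCap || bytesInChunk >= maxBytes) {
                counts.add(rowsInChunk);
                rowsInChunk = 0;
                bytesInChunk = 0;
            }
        }
        if (rowsInChunk > 0) {
            counts.add(rowsInChunk); // trailing partial chunk
        }
        return counts;
    }

    public static void main(String[] args) {
        long[] rows = new long[1000];
        Arrays.fill(rows, 100L);   // 1000 rows of 100 bytes each (assumed)
        long maxBytes = 20_000L;   // hypothetical per-response size limit

        // Small fixed row cap: 100 round trips of 10 rows each.
        System.out.println(chunkRowCounts(rows, 10, maxBytes).size());                // prints 100
        // Effectively unlimited row cap: 5 round trips of 200 rows each.
        System.out.println(chunkRowCounts(rows, Integer.MAX_VALUE, maxBytes).size()); // prints 5
    }
}
```

In this model, removing the Phoenix default and setting it to Integer.MAX_VALUE behave the same way, provided the underlying default is itself unlimited, as the comment states.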
[jira] [Commented] (PHOENIX-4266) Avoid scanner caching in Phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649135#comment-16649135 ]

Lars Hofhansl commented on PHOENIX-4266:

[~sergey.soldatov], do you have an opinion on this?

> Avoid scanner caching in Phoenix
>
> Key: PHOENIX-4266
> URL: https://issues.apache.org/jira/browse/PHOENIX-4266
> Project: Phoenix
> Issue Type: Bug
> Reporter: Lars Hofhansl
> Assignee: Sergey Soldatov
> Priority: Major