Re: Phoenix-Spark driver structured streaming support?
Thanks, but this is just a sample test I was doing. My use case is to read data from Kafka in a streaming fashion and then write to Phoenix.

On Thu, Feb 22, 2018 at 5:15 PM, Pedro Boado wrote:
> No, it's not supported.
>
> Why don't you just run your example in Spark batch and save the
> dataframe/RDD to Phoenix? Your data is coming from a JSON file (which in
> the end is a static source, not a stream).
>
> On 23 Feb 2018 01:08, "Suhas H M" wrote:
>
> > Hi,
> >
> > Is Spark structured streaming supported using the Phoenix-Spark driver?
> > When the phoenix-spark driver is used to write structured streaming data,
> > we get the exception:
> >
> > Exception in thread "main" java.lang.UnsupportedOperationException: Data
> > source org.apache.phoenix.spark does not support streamed writing
> >     at org.apache.spark.sql.execution.datasources.DataSource.createSink(DataSource.scala:287)
> >     at org.apache.spark.sql.streaming.DataStreamWriter.start(DataStreamWriter.scala:266)
> >
> > Code:
> >
> > Dataset<Row> inputDF =
> >     sparkSession
> >         .readStream()
> >         .schema(jsonSchema)
> >         .json(inputPath);
> >
> > StreamingQuery query = inputDF
> >     .writeStream()
> >     .format("org.apache.phoenix.spark")
> >     .outputMode(OutputMode.Complete())
> >     .option("zkUrl", "localhost:2181")
> >     .option("table", "SHM2")
> >     .start();
> >
> > query.awaitTermination();
> >
> > Jira - https://issues.apache.org/jira/browse/PHOENIX-4627
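The exception above is raised during Spark's sink resolution: when `start()` is called, `DataSource.createSink` checks whether the format implements Spark's streaming-sink interface, and at this version `org.apache.phoenix.spark` only implements the batch write path. Below is a minimal, self-contained model of that capability check; the interface and class names are illustrative stand-ins, not Spark's actual types.

```java
// Simplified model of why DataStreamWriter.start() fails for a batch-only
// data source. "BatchWriteProvider" and "StreamSinkProvider" here stand in
// for Spark's real provider interfaces; they are not the actual Spark API.
interface BatchWriteProvider {}   // batch save path (what phoenix-spark implements)
interface StreamSinkProvider {}   // streaming sink path (what it lacks)

public class SinkCheck {
    // Batch-only source, analogous to org.apache.phoenix.spark at this version.
    static final class PhoenixBatchSource implements BatchWriteProvider {}

    // Mirrors the check DataSource.createSink performs: reject any provider
    // that cannot act as a streaming sink.
    static void createSink(Object provider, String name) {
        if (!(provider instanceof StreamSinkProvider)) {
            throw new UnsupportedOperationException(
                "Data source " + name + " does not support streamed writing");
        }
    }

    public static void main(String[] args) {
        try {
            createSink(new PhoenixBatchSource(), "org.apache.phoenix.spark");
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

This is why switching the write side to a batch save (as suggested above), or writing each micro-batch through a separate batch path, sidesteps the error: the batch path never goes through `createSink`.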
[jira] [Commented] (PHOENIX-4530) Do not collect delete markers during major compaction of table with disabled mutable indexes
[ https://issues.apache.org/jira/browse/PHOENIX-4530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373844#comment-16373844 ]

Hudson commented on PHOENIX-4530:
---------------------------------

FAILURE: Integrated in Jenkins build Phoenix-4.x-HBase-0.98 #1821 (See [https://builds.apache.org/job/Phoenix-4.x-HBase-0.98/1821/])
PHOENIX-4530 Do not collect delete markers during major compaction of (vincentpoon: rev 178405d7012b05a683da57f5f2e53480e1dd6aed)
* (add) phoenix-core/src/it/java/org/apache/phoenix/end2end/PartialScannerResultsDisabledIT.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/coprocessor/UngroupedAggregateRegionObserver.java
* (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/index/PartialIndexRebuilderIT.java
* (delete) phoenix-core/src/it/java/org/apache/phoenix/end2end/UngroupedAggregateRegionObserverIT.java
* (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/index/MutableIndexIT.java
* (edit) phoenix-core/src/test/java/org/apache/phoenix/util/TestUtil.java

> Do not collect delete markers during major compaction of table with disabled mutable indexes
> --------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-4530
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4530
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 4.13.0
>            Reporter: James Taylor
>            Assignee: Vincent Poon
>            Priority: Major
>             Fix For: 4.14.0
>
>         Attachments: PHOENIX-4530.4.x-HBase-0.98.v2.patch, PHOENIX-4530.4.x-HBase-1.1.v2.patch, PHOENIX-4530.master.v1.patch, PHOENIX-4530.master.v2.patch
>
> If major compaction occurs on a table with mutable indexes that have the INDEX_DISABLE_TIMESTAMP set, we currently permanently disable the index, forcing it to be manually rebuilt from scratch. This is to prevent it from potentially being corrupted, as we need the delete markers to remain in order to guarantee the data table and index table remain in sync.
> An alternate approach (mentioned by [~an...@apache.org] during review) is to detect this case in a pre-compaction hook and set the compaction up so that delete markers are not removed. This would have the advantage that we wouldn't have to permanently disable the index and rebuild it from scratch.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-4530) Do not collect delete markers during major compaction of table with disabled mutable indexes
[ https://issues.apache.org/jira/browse/PHOENIX-4530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373814#comment-16373814 ]

Hudson commented on PHOENIX-4530:
---------------------------------

FAILURE: Integrated in Jenkins build Phoenix-4.x-HBase-1.3 #47 (See [https://builds.apache.org/job/Phoenix-4.x-HBase-1.3/47/])
PHOENIX-4530 Do not collect delete markers during major compaction of (vincentpoon: rev e8d3ed00cb2d44b35ebad019635cbbdc9a652681)
* (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/PartialScannerResultsDisabledIT.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/coprocessor/UngroupedAggregateRegionObserver.java
* (edit) phoenix-core/src/test/java/org/apache/phoenix/util/TestUtil.java
* (delete) phoenix-core/src/it/java/org/apache/phoenix/end2end/UngroupedAggregateRegionObserverIT.java
* (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/index/PartialIndexRebuilderIT.java
* (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/index/MutableIndexIT.java

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-4530) Do not collect delete markers during major compaction of table with disabled mutable indexes
[ https://issues.apache.org/jira/browse/PHOENIX-4530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373790#comment-16373790 ]

Vincent Poon commented on PHOENIX-4530:
---------------------------------------

Committed to 4.x and master branches. Still working on a 5.x patch, as HBase 2.0+ changed things a bit.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PHOENIX-4530) Do not collect delete markers during major compaction of table with disabled mutable indexes
[ https://issues.apache.org/jira/browse/PHOENIX-4530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vincent Poon updated PHOENIX-4530:
----------------------------------

    Affects Version/s: 4.13.0
        Fix Version/s: 4.14.0

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PHOENIX-4530) Do not collect delete markers during major compaction of table with disabled mutable indexes
[ https://issues.apache.org/jira/browse/PHOENIX-4530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vincent Poon updated PHOENIX-4530:
----------------------------------

    Attachment: PHOENIX-4530.4.x-HBase-1.1.v2.patch
                PHOENIX-4530.4.x-HBase-0.98.v2.patch

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Phoenix-Spark driver structured streaming support?
No, it's not supported.

Why don't you just run your example in Spark batch and save the dataframe/RDD to Phoenix? Your data is coming from a JSON file (which in the end is a static source, not a stream).

On 23 Feb 2018 01:08, "Suhas H M" wrote:
> Hi,
>
> Is Spark structured streaming supported using the Phoenix-Spark driver?
> When the phoenix-spark driver is used to write structured streaming data,
> we get the exception:
>
> Exception in thread "main" java.lang.UnsupportedOperationException: Data
> source org.apache.phoenix.spark does not support streamed writing
>     at org.apache.spark.sql.execution.datasources.DataSource.createSink(DataSource.scala:287)
>     at org.apache.spark.sql.streaming.DataStreamWriter.start(DataStreamWriter.scala:266)
>
> Code:
>
> Dataset<Row> inputDF =
>     sparkSession
>         .readStream()
>         .schema(jsonSchema)
>         .json(inputPath);
>
> StreamingQuery query = inputDF
>     .writeStream()
>     .format("org.apache.phoenix.spark")
>     .outputMode(OutputMode.Complete())
>     .option("zkUrl", "localhost:2181")
>     .option("table", "SHM2")
>     .start();
>
> query.awaitTermination();
>
> Jira - https://issues.apache.org/jira/browse/PHOENIX-4627
Phoenix-Spark driver structured streaming support?
Hi,

Is Spark structured streaming supported using the Phoenix-Spark driver? When the phoenix-spark driver is used to write structured streaming data, we get the exception:

Exception in thread "main" java.lang.UnsupportedOperationException: Data source org.apache.phoenix.spark does not support streamed writing
    at org.apache.spark.sql.execution.datasources.DataSource.createSink(DataSource.scala:287)
    at org.apache.spark.sql.streaming.DataStreamWriter.start(DataStreamWriter.scala:266)

Code:

Dataset<Row> inputDF =
    sparkSession
        .readStream()
        .schema(jsonSchema)
        .json(inputPath);

StreamingQuery query = inputDF
    .writeStream()
    .format("org.apache.phoenix.spark")
    .outputMode(OutputMode.Complete())
    .option("zkUrl", "localhost:2181")
    .option("table", "SHM2")
    .start();

query.awaitTermination();

Jira - https://issues.apache.org/jira/browse/PHOENIX-4627
[jira] [Commented] (PHOENIX-4607) Allow PhoenixInputFormat to use tenant-specific connections
[ https://issues.apache.org/jira/browse/PHOENIX-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373751#comment-16373751 ]

Geoffrey Jacoby commented on PHOENIX-4607:
------------------------------------------

The test failures go away when run against HBase 1.3. There was a failure from ParallelIteratorsIT, but it was a minicluster failure and went away when I ran it standalone.

> Allow PhoenixInputFormat to use tenant-specific connections
> -----------------------------------------------------------
>
>                 Key: PHOENIX-4607
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4607
>             Project: Phoenix
>          Issue Type: New Feature
>    Affects Versions: 4.13.0
>            Reporter: Geoffrey Jacoby
>            Assignee: Geoffrey Jacoby
>            Priority: Major
>              Labels: mapreduce
>         Attachments: PHOENIX-4607.patch
>
> When using Phoenix's MapReduce integration, the actual connections for the SELECT query are created by PhoenixInputFormat. While PhoenixInputFormat has support for a few connection properties such as SCN, a TenantId is not one of them.
> Add the ability to specify a TenantId for the PhoenixInputFormat's connections to use.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (PHOENIX-4627) Phoenix-Spark doesn't support Spark structured streaming
Suhas created PHOENIX-4627:
------------------------------

             Summary: Phoenix-Spark doesn't support Spark structured streaming
                 Key: PHOENIX-4627
                 URL: https://issues.apache.org/jira/browse/PHOENIX-4627
             Project: Phoenix
          Issue Type: Bug
    Affects Versions: 4.11.0
            Reporter: Suhas

Spark 2.x introduced structured streaming ([https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html]). However, when the phoenix-spark driver is used to write structured streaming data, we get the exception:

Exception in thread "main" java.lang.UnsupportedOperationException: Data source org.apache.phoenix.spark does not support streamed writing
    at org.apache.spark.sql.execution.datasources.DataSource.createSink(DataSource.scala:287)
    at org.apache.spark.sql.streaming.DataStreamWriter.start(DataStreamWriter.scala:266)

Code:

Dataset<Row> inputDF =
    sparkSession
        .readStream()
        .schema(jsonSchema)
        .json(inputPath);

StreamingQuery query = inputDF
    .writeStream()
    .format("org.apache.phoenix.spark")
    .outputMode(OutputMode.Complete())
    .option("zkUrl", "localhost:2181")
    .option("table", "SHM2")
    .start();

query.awaitTermination();

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (PHOENIX-4626) Increase time allowed for partial index rebuild to complete
James Taylor created PHOENIX-4626:
-------------------------------------

             Summary: Increase time allowed for partial index rebuild to complete
                 Key: PHOENIX-4626
                 URL: https://issues.apache.org/jira/browse/PHOENIX-4626
             Project: Phoenix
          Issue Type: Bug
            Reporter: James Taylor
            Assignee: James Taylor
             Fix For: 4.14.0

Currently a mutable index is marked as disabled if it cannot be caught up by the partial index rebuilder after 30 minutes. This is too short a time. Instead, we should allow 24 hours.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-4148) COUNT(DISTINCT(...)) should have a memory size limit
[ https://issues.apache.org/jira/browse/PHOENIX-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373707#comment-16373707 ]

James Taylor commented on PHOENIX-4148:
---------------------------------------

I'll go ahead and commit your patch unless I hear objections, [~lhofhansl]. I'd like to get this into 4.14 as it prevents a potential RS crash. The code/approach can always be changed/improved down the road.

> COUNT(DISTINCT(...)) should have a memory size limit
> ----------------------------------------------------
>
>                 Key: PHOENIX-4148
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4148
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>            Priority: Major
>             Fix For: 4.14.0
>
>         Attachments: 4148.txt
>
> I just managed to kill (hang) a region server by issuing a COUNT(DISTINCT(...)) query over a column with very high cardinality (20m in this case).
> This is perhaps not a useful thing to do, but Phoenix should nonetheless not allow a server to fail because of a query.
> [~jamestaylor], I see the GlobalMemoryManager, but I do not quite see how I'd get a reference to one; one needs a tenant id, etc.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-4607) Allow PhoenixInputFormat to use tenant-specific connections
[ https://issues.apache.org/jira/browse/PHOENIX-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373636#comment-16373636 ]

James Taylor commented on PHOENIX-4607:
---------------------------------------

Phoenix on master uses HBase 1.4, which has known test failures that need to be triaged and fixed (see PHOENIX-4539). I'd recommend trying the 4.x-HBase-1.3 branch.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-4607) Allow PhoenixInputFormat to use tenant-specific connections
[ https://issues.apache.org/jira/browse/PHOENIX-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373630#comment-16373630 ]

Geoffrey Jacoby commented on PHOENIX-4607:
------------------------------------------

HadoopQA doesn't seem to be posting to the JIRA, but the test run did occur. There were 10 test failures, listed here: [https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-PHOENIX-Build/1766/#showFailuresLink]

When I run mvn clean verify locally, I can reproduce all the test failures, so they're not flappers – but the same test failures occur if I run mvn clean verify from an unchanged pull of the master branch. So, while we all seem due for another round of test cleanup, this patch should be fine. [~jamestaylor], wdyt?

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-4625) memory leak in PhoenixConnection if scanner renew lease thread is not enabled
[ https://issues.apache.org/jira/browse/PHOENIX-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373502#comment-16373502 ]

Hudson commented on PHOENIX-4625:
---------------------------------

SUCCESS: Integrated in Jenkins build Phoenix-4.x-HBase-1.3 #46 (See [https://builds.apache.org/job/Phoenix-4.x-HBase-1.3/46/])
PHOENIX-4625 memory leak in PhoenixConnection if scanner renew lease (tdsilva: rev cb682c9a19695ed33c7e6c3889c20b4071cfa9e7)
* (edit) phoenix-core/src/main/java/org/apache/phoenix/jdbc/PhoenixConnection.java

> memory leak in PhoenixConnection if scanner renew lease thread is not enabled
> -----------------------------------------------------------------------------
>
>                 Key: PHOENIX-4625
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4625
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.14.0
>            Reporter: Vikas Vishwakarma
>            Assignee: Vikas Vishwakarma
>            Priority: Major
>             Fix For: 4.14.0, 5.0.0
>
>         Attachments: GC_After_fix.png, GC_Leak.png, PHOENIX-4625.patch, QS.png
>
> We have two different code paths:
> # In ConnectionQueryServicesImpl, the RenewLeaseTask is scheduled based on two checks: whether the renew lease feature is supported and whether the renew lease config is enabled:
>   supportsFeature(ConnectionQueryServices.Feature.RENEW_LEASE) && renewLeaseEnabled
> # In PhoenixConnection, the iterator for every scan is added to a queue for lease renewal based on just the check that the renew lease feature is supported:
>   services.supportsFeature(Feature.RENEW_LEASE)
>
> In PhoenixConnection we thus miss the check on whether the renew lease config (phoenix.scanner.lease.renew.enabled) is enabled.
>
> Now consider a situation where the renew lease feature is supported but phoenix.scanner.lease.renew.enabled is set to false in hbase-site.xml. In this case PhoenixConnection will keep adding the iterator for every scan into the scannerQueue for renewal based on the feature-supported check, but the renewal task is not running because phoenix.scanner.lease.renew.enabled is false. So the scannerQueue will keep growing as long as the PhoenixConnection is alive and scan requests keep coming in on this connection.
>
> We have a use case that uses a single perpetual PhoenixConnection and does billions of scans on it. In this case the scannerQueue grows to several GBs, ultimately leading to consecutive full GCs/OOM.
>
> Adding iterators for lease renewal in PhoenixConnection:
> {code:java}
> public void addIteratorForLeaseRenewal(@Nonnull TableResultIterator itr) {
>     if (services.supportsFeature(Feature.RENEW_LEASE)) {
>         checkNotNull(itr);
>         scannerQueue.add(new WeakReference<TableResultIterator>(itr));
>     }
> }
> {code}
>
> Starting the RenewLeaseTask checks both that Feature.RENEW_LEASE is supported and that phoenix.scanner.lease.renew.enabled is true:
> {code:java}
> ConnectionQueryServicesImpl {
>     this.renewLeaseEnabled = config.getBoolean(RENEW_LEASE_ENABLED, DEFAULT_RENEW_LEASE_ENABLED);
>     ...
>     @Override
>     public boolean isRenewingLeasesEnabled() {
>         return supportsFeature(ConnectionQueryServices.Feature.RENEW_LEASE) && renewLeaseEnabled;
>     }
>
>     private void scheduleRenewLeaseTasks() {
>         if (isRenewingLeasesEnabled()) {
>             renewLeaseExecutor = Executors.newScheduledThreadPool(renewLeasePoolSize, renewLeaseThreadFactory);
>             for (LinkedBlockingQueue<WeakReference<TableResultIterator>> q : connectionQueues) {
>                 renewLeaseExecutor.scheduleAtFixedRate(new RenewLeaseTask(q), 0, renewLeaseTaskFrequency, TimeUnit.MILLISECONDS);
>             }
>         }
>     }
>     ...
> }
> {code}
>
> To solve this, PhoenixConnection must apply both checks before adding iterators to the scannerQueue: the feature ConnectionQueryServices.Feature.RENEW_LEASE is supported AND phoenix.scanner.lease.renew.enabled is true, instead of only checking that the feature is supported.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
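The fix described in the report, gating the enqueue on both the feature check and the config check, can be sketched as a small self-contained model. The class and field names below are illustrative, not the actual Phoenix classes:

```java
import java.lang.ref.WeakReference;
import java.util.LinkedList;
import java.util.Queue;

// Illustrative model of the PHOENIX-4625 fix: the connection must gate
// enqueueing on BOTH the RENEW_LEASE feature AND the
// phoenix.scanner.lease.renew.enabled config, mirroring the check that
// ConnectionQueryServicesImpl uses before scheduling the RenewLeaseTask.
public class LeaseRenewalGate {
    private final boolean featureSupported;   // models Feature.RENEW_LEASE
    private final boolean renewLeaseEnabled;  // models phoenix.scanner.lease.renew.enabled
    private final Queue<WeakReference<Object>> scannerQueue = new LinkedList<>();

    LeaseRenewalGate(boolean featureSupported, boolean renewLeaseEnabled) {
        this.featureSupported = featureSupported;
        this.renewLeaseEnabled = renewLeaseEnabled;
    }

    // Matches isRenewingLeasesEnabled(): both conditions must hold.
    boolean isRenewingLeasesEnabled() {
        return featureSupported && renewLeaseEnabled;
    }

    // Fixed version: without the config half of the check, this queue grows
    // without bound on a long-lived connection, because no RenewLeaseTask
    // ever runs to drain it.
    void addIteratorForLeaseRenewal(Object iterator) {
        if (isRenewingLeasesEnabled()) {
            scannerQueue.add(new WeakReference<>(iterator));
        }
    }

    int queueSize() { return scannerQueue.size(); }
}
```

With the feature supported but the config disabled, the queue now stays empty instead of accumulating one entry per scan, which is exactly the leak scenario described above.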
[jira] [Commented] (PHOENIX-4625) memory leak in PhoenixConnection if scanner renew lease thread is not enabled
[ https://issues.apache.org/jira/browse/PHOENIX-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373407#comment-16373407 ]

Hudson commented on PHOENIX-4625:
---------------------------------

SUCCESS: Integrated in Jenkins build Phoenix-4.x-HBase-0.98 #1820 (See [https://builds.apache.org/job/Phoenix-4.x-HBase-0.98/1820/])
PHOENIX-4625 memory leak in PhoenixConnection if scanner renew lease (tdsilva: rev 4fc3f7545db831e83bc82783a0655df79821c107)
* (edit) phoenix-core/src/main/java/org/apache/phoenix/jdbc/PhoenixConnection.java

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PHOENIX-4548) UpgradeUtil.mapChildViewsToNamespace does not handle multi-tenant views that have the same name.
[ https://issues.apache.org/jira/browse/PHOENIX-4548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas D'Silva updated PHOENIX-4548:
------------------------------------

    Fix Version/s: 5.0.0

> UpgradeUtil.mapChildViewsToNamespace does not handle multi-tenant views that have the same name.
> ------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-4548
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4548
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Thomas D'Silva
>            Assignee: Thomas D'Silva
>            Priority: Major
>             Fix For: 4.14.0, 5.0.0
>
>         Attachments: PHOENIX-4548.patch

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-4548) UpgradeUtil.mapChildViewsToNamespace does not handle multi-tenant views that have the same name.
[ https://issues.apache.org/jira/browse/PHOENIX-4548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373228#comment-16373228 ]

Thomas D'Silva commented on PHOENIX-4548:
-----------------------------------------

[~rajeshbabu] Sorry for the delayed response; it looks like this patch is already committed to the 5.x branch.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-4530) Do not collect delete markers during major compaction of table with disabled mutable indexes
[ https://issues.apache.org/jira/browse/PHOENIX-4530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373225#comment-16373225 ]

James Taylor commented on PHOENIX-4530:
---------------------------------------

Thanks for confirming, [~vincentpoon]. Please commit to the 4.x, 5.x, and master branches.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-4333) Incorrect estimate when stats are updated on a tenant specific view
[ https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373223#comment-16373223 ]

James Taylor commented on PHOENIX-4333:
---------------------------------------

Please review, [~tdsilva]. If you have spare cycles to take a look too, [~samarthjain], that'd be much appreciated.

The basic idea of the patch is to detect if we encounter a region with no guideposts, and if so, to set the estimated timestamp to null. This covers the issue described for this JIRA. Also, some edge cases are now accounted for: namely when the guidepost is equal to the start key of the scan or equal to the end region key.

> Incorrect estimate when stats are updated on a tenant specific view
> -------------------------------------------------------------------
>
>                 Key: PHOENIX-4333
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4333
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.12.0
>            Reporter: Mujtaba Chohan
>            Assignee: James Taylor
>            Priority: Major
>              Labels: SFDC, stats
>             Fix For: 4.14.0
>
>         Attachments: PHOENIX-4333_test.patch, PHOENIX-4333_v1.patch, PHOENIX-4333_v2.patch, PHOENIX-4333_v3.patch, PHOENIX-4333_wip1.patch, PHOENIX-4333_wip2.patch, PHOENIX-4333_wip3.patch, PHOENIX-4333_wip4.patch
>
> Consider two tenants A and B with tenant-specific views on 2 separate regions/region servers.
> {noformat}
> Region 1 keys:
> A,1
> A,2
> B,1
> Region 2 keys:
> B,2
> B,3
> {noformat}
> When stats are updated on the tenant A view, querying stats on the tenant B view yields partial results (containing stats only for B,1), which are incorrect even though the updated timestamp shows as current.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
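The "region with no guideposts" detection described in the comment above can be sketched as follows. This is an illustrative model rather than the actual Phoenix stats code; the `Region` class and method name are assumptions made for the sketch:

```java
import java.util.List;

// Illustrative sketch of the PHOENIX-4333 idea: while computing estimates
// from guideposts, a region with no guideposts at all means stats do not
// cover the scan range, so the estimate timestamp is reported as null
// instead of claiming the (partial) estimate is current.
public class StatsEstimate {
    // Hypothetical region model: each region carries the timestamps of its
    // guideposts; an empty array means no guideposts exist for it.
    static final class Region {
        final long[] guidepostTimestamps;
        Region(long... ts) { this.guidepostTimestamps = ts; }
    }

    // Returns the oldest guidepost timestamp across the scanned regions, or
    // null if any region in the range has no guideposts (stats incomplete).
    static Long estimateInfoTimestamp(List<Region> regions) {
        long min = Long.MAX_VALUE;
        for (Region r : regions) {
            if (r.guidepostTimestamps.length == 0) {
                return null;  // don't report a timestamp for a partial estimate
            }
            for (long ts : r.guidepostTimestamps) {
                min = Math.min(min, ts);
            }
        }
        return min;
    }
}
```

In the two-tenant scenario from the description, tenant B's scan would cross Region 2, which has no guideposts after only tenant A's view was analyzed, so the estimate timestamp comes back null rather than appearing current.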
[jira] [Updated] (PHOENIX-4333) Incorrect estimate when stats are updated on a tenant specific view
[ https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Taylor updated PHOENIX-4333: -- Attachment: PHOENIX-4333_v3.patch
[jira] [Commented] (PHOENIX-4625) memory leak in PhoenixConnection if scanner renew lease thread is not enabled
[ https://issues.apache.org/jira/browse/PHOENIX-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373201#comment-16373201 ] Thomas D'Silva commented on PHOENIX-4625: - Sure, I will get this committed today.
> memory leak in PhoenixConnection if scanner renew lease thread is not enabled
> -
>
> Key: PHOENIX-4625
> URL: https://issues.apache.org/jira/browse/PHOENIX-4625
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.14.0
> Reporter: Vikas Vishwakarma
> Assignee: Vikas Vishwakarma
> Priority: Major
> Fix For: 4.14.0
>
> Attachments: GC_After_fix.png, GC_Leak.png, PHOENIX-4625.patch, QS.png
>
>
> We have two different code paths:
> # In ConnectionQueryServicesImpl, the RenewLeaseTask is scheduled based on
> two checks, whether the renew lease feature is supported and whether the
> renew lease config is enabled:
> supportsFeature(ConnectionQueryServices.Feature.RENEW_LEASE) && renewLeaseEnabled
> # In PhoenixConnection, for every scan an iterator is added to a queue for
> lease renewal based on just one check, whether the renew lease feature is
> supported: services.supportsFeature(Feature.RENEW_LEASE)
> PhoenixConnection thus misses the check for whether the renew lease config
> (phoenix.scanner.lease.renew.enabled) is enabled.
>
> Now consider a situation where the renew lease feature is supported but
> phoenix.scanner.lease.renew.enabled is set to false in hbase-site.xml. In
> this case PhoenixConnection will keep adding the iterators for every scan
> into the scannerQueue for renewal based on the feature-supported check, but
> the renewal task is not running because phoenix.scanner.lease.renew.enabled
> is set to false, so the scannerQueue will keep growing as long as the
> PhoenixConnection is alive and scan requests keep coming on this connection.
>
> We have a use case that uses a single perpetual PhoenixConnection that
> does billions of scans on this connection.
In this case the scannerQueue
> grows to several GBs, ultimately leading to consecutive Full GCs / OOM.
>
> Adding iterators for lease renewal in PhoenixConnection
> =
> {code:java}
> public void addIteratorForLeaseRenewal(@Nonnull TableResultIterator itr) {
>     if (services.supportsFeature(Feature.RENEW_LEASE)) {
>         checkNotNull(itr);
>         scannerQueue.add(new WeakReference<TableResultIterator>(itr));
>     }
> }
> {code}
>
> Starting the RenewLeaseTask
> =
> Checks whether Feature.RENEW_LEASE is supported and whether
> phoenix.scanner.lease.renew.enabled is true, and starts the RenewLeaseTask:
> {code:java}
> ConnectionQueryServicesImpl {
>     this.renewLeaseEnabled = config.getBoolean(RENEW_LEASE_ENABLED,
>         DEFAULT_RENEW_LEASE_ENABLED);
>     ...
>     @Override
>     public boolean isRenewingLeasesEnabled() {
>         return supportsFeature(ConnectionQueryServices.Feature.RENEW_LEASE)
>             && renewLeaseEnabled;
>     }
>
>     private void scheduleRenewLeaseTasks() {
>         if (isRenewingLeasesEnabled()) {
>             renewLeaseExecutor = Executors.newScheduledThreadPool(
>                 renewLeasePoolSize, renewLeaseThreadFactory);
>             for (LinkedBlockingQueue<WeakReference<TableResultIterator>> q : connectionQueues) {
>                 renewLeaseExecutor.scheduleAtFixedRate(new RenewLeaseTask(q), 0,
>                     renewLeaseTaskFrequency, TimeUnit.MILLISECONDS);
>             }
>         }
>     }
>     ...
> }
> {code}
>
> To solve this, we must add both checks in PhoenixConnection, that the feature
> is supported and that the config is enabled, before adding the iterators to
> the scannerQueue:
> ConnectionQueryServices.Feature.RENEW_LEASE is supported &&
> phoenix.scanner.lease.renew.enabled is true
> instead of just checking that the feature
> ConnectionQueryServices.Feature.RENEW_LEASE is supported.
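The fix described above, gating on both the feature check and the config check before queueing iterators, can be sketched with simplified stand-ins (the TableResultIterator stub and the two boolean flags here stand in for Phoenix's actual classes and configuration plumbing):

```java
import java.lang.ref.WeakReference;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Illustrative sketch of the proposed PhoenixConnection fix: only queue an
// iterator for lease renewal when the renewal task will actually run.
class LeaseRenewalSketch {
    static class TableResultIterator {} // stand-in for Phoenix's iterator

    private final boolean featureSupported;  // stands in for Feature.RENEW_LEASE
    private final boolean renewLeaseEnabled; // phoenix.scanner.lease.renew.enabled
    final Queue<WeakReference<TableResultIterator>> scannerQueue = new ConcurrentLinkedQueue<>();

    LeaseRenewalSketch(boolean featureSupported, boolean renewLeaseEnabled) {
        this.featureSupported = featureSupported;
        this.renewLeaseEnabled = renewLeaseEnabled;
    }

    // Mirrors ConnectionQueryServicesImpl.isRenewingLeasesEnabled(): both checks.
    boolean isRenewingLeasesEnabled() {
        return featureSupported && renewLeaseEnabled;
    }

    // The fix: gate on both checks instead of the feature check alone, so the
    // queue cannot grow unboundedly when the renewal task is not scheduled.
    void addIteratorForLeaseRenewal(TableResultIterator itr) {
        if (isRenewingLeasesEnabled()) {
            scannerQueue.add(new WeakReference<>(itr));
        }
    }
}
```

With the feature supported but the config disabled, nothing is queued, which is exactly the leaking combination from the bug report.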
[jira] [Assigned] (PHOENIX-4625) memory leak in PhoenixConnection if scanner renew lease thread is not enabled
[ https://issues.apache.org/jira/browse/PHOENIX-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas D'Silva reassigned PHOENIX-4625: --- Assignee: Vikas Vishwakarma
[jira] [Commented] (PHOENIX-4609) Error Occurs while selecting a specific set of columns : ERROR 201 (22000): Illegal data. Expected length of at least 8 bytes, but had 2
[ https://issues.apache.org/jira/browse/PHOENIX-4609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373199#comment-16373199 ] Aman Jha commented on PHOENIX-4609: --- [~elserj] are you sure? The stack traces seem different in the two cases. Moreover, I cannot understand how the SquirrelSQL client is able to handle the response while Java code for the same query cannot. Is there any other way to process a Phoenix ResultSet?
> Error Occurs while selecting a specific set of columns : ERROR 201 (22000):
> Illegal data. Expected length of at least 8 bytes, but had 2
>
> Key: PHOENIX-4609
> URL: https://issues.apache.org/jira/browse/PHOENIX-4609
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.11.0, 4.13.0
> Reporter: Aman Jha
> Priority: Critical
> Attachments: DML_DDL.sql, SelectStatement.sql, TestPhoenix.java
>
>
> While selecting columns from a table, an error occurs for illegal data:
> h3. _*ERROR 201 (22000): Illegal data. Expected length of at least 8 bytes,
> but had 2*_
> The data is read/written only through the Phoenix client.
> Moreover, this error occurs only when running queries via a Java program,
> not through the Squirrel SQL client. Is there any other way to access
> results from the ResultSet that is returned by the Phoenix client?
>
> *Environment Details*:
> *HBase Version*: _1.2.6 on Hadoop 2.8.2_
> *Phoenix Version*: _4.11.0-HBase-1.2_
> *OS*: _LINUX(RHEL)_
>
> The following error is raised when selecting columns via a Java program:
> {code:java}
> ERROR 201 (22000): Illegal data. Expected length of at least 8 bytes, but had
> 2; nested exception is java.sql.SQLException: ERROR 201 (22000): Illegal
> data.
Expected length of at least 8 bytes, but had 2
> at org.springframework.jdbc.support.SQLStateSQLExceptionTranslator.doTranslate(SQLStateSQLExceptionTranslator.java:102)
> at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:73)
> at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:81)
> at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:81)
> at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:419)
> at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:474)
> at com.zycus.qe.service.impl.PhoenixHBaseDAOImpl.fetchAggregationResult(PhoenixHBaseDAOImpl.java:752)
> ... 14 common frames omitted
> Caused by: java.sql.SQLException: ERROR 201 (22000): Illegal data. Expected length of at least 8 bytes, but had 2
> at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:483)
> at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:150)
> at org.apache.phoenix.schema.KeyValueSchema.next(KeyValueSchema.java:213)
> at org.apache.phoenix.schema.KeyValueSchema.iterator(KeyValueSchema.java:165)
> at org.apache.phoenix.schema.KeyValueSchema.iterator(KeyValueSchema.java:171)
> at org.apache.phoenix.schema.KeyValueSchema.iterator(KeyValueSchema.java:175)
> at org.apache.phoenix.expression.ProjectedColumnExpression.evaluate(ProjectedColumnExpression.java:115)
> at org.apache.phoenix.iterate.OrderedResultIterator.getResultIterator(OrderedResultIterator.java:260)
> at org.apache.phoenix.iterate.OrderedResultIterator.next(OrderedResultIterator.java:199)
> at org.apache.phoenix.iterate.BaseGroupedAggregatingResultIterator.next(BaseGroupedAggregatingResultIterator.java:64)
> at org.apache.phoenix.iterate.LookAheadResultIterator$1.advance(LookAheadResultIterator.java:47)
> at org.apache.phoenix.iterate.LookAheadResultIterator.init(LookAheadResultIterator.java:59)
> at org.apache.phoenix.iterate.LookAheadResultIterator.next(LookAheadResultIterator.java:65)
> at org.apache.phoenix.iterate.BaseGroupedAggregatingResultIterator.next(BaseGroupedAggregatingResultIterator.java:64)
> at org.apache.phoenix.iterate.OrderedResultIterator.getResultIterator(OrderedResultIterator.java:255)
> at org.apache.phoenix.iterate.OrderedResultIterator.next(OrderedResultIterator.java:199)
> at org.apache.phoenix.iterate.OrderedAggregatingResultIterator.next(OrderedAggregatingResultIterator.java:51)
> at org.apache.phoenix.iterate.DelegateResultIterator.next(DelegateResultIterator.java:44)
> at org.apache.phoenix.execute.TupleProjectionPlan$1.next(TupleProjectionPlan.java:62)
> at org.apache.phoenix.iterate.LookAheadResultIterator$1.advance(LookAheadResultIterator.java:47)
> at >
[jira] [Commented] (PHOENIX-4625) memory leak in PhoenixConnection if scanner renew lease thread is not enabled
[ https://issues.apache.org/jira/browse/PHOENIX-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373188#comment-16373188 ] James Taylor commented on PHOENIX-4625: --- [~tdsilva] - would you have some spare cycles to commit this branch to 4.x, master, and 5.x branches?
[jira] [Commented] (PHOENIX-4609) Error Occurs while selecting a specific set of columns : ERROR 201 (22000): Illegal data. Expected length of at least 8 bytes, but had 2
[ https://issues.apache.org/jira/browse/PHOENIX-4609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373131#comment-16373131 ] Josh Elser commented on PHOENIX-4609: - You may be running into PHOENIX-4588