[jira] [Created] (PHOENIX-7409) Tests in CDCQueryIT are flapping when using a different number of salt buckets for CDC and data table
Hari Krishna Dara created PHOENIX-7409: -- Summary: Tests in CDCQueryIT are flapping when using a different number of salt buckets for CDC and data table Key: PHOENIX-7409 URL: https://issues.apache.org/jira/browse/PHOENIX-7409 Project: Phoenix Issue Type: Sub-task Reporter: Hari Krishna Dara When the data table has salt buckets and you create a CDC with a different number of salt buckets, tests in CDCQueryIT are flapping. -- This message was sent by Atlassian Jira (v8.20.10#820010)
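For context, here is a minimal sketch (not taken from the tests) of the scenario CDCQueryIT covers, using the CDC DDL shape described under PHOENIX-7001 further below; the table and CDC names, bucket counts, and the date literal are illustrative assumptions:

{code:sql}
-- Data table salted with 4 buckets.
CREATE TABLE orders (k INTEGER PRIMARY KEY, v1 INTEGER) SALT_BUCKETS=4;

-- CDC object whose index uses a different bucket count (2 instead of 4);
-- DDL shape follows the sketch in PHOENIX-7001.
CREATE CDC orders_cdc ON orders SALT_BUCKETS=2;

UPSERT INTO orders (k, v1) VALUES (1, 100);

-- Time-range change queries like this are what the tests verify; with
-- mismatched bucket counts they intermittently fail (hence the flapping).
SELECT * FROM orders_cdc
WHERE PHOENIX_ROW_TIMESTAMP() >= TO_DATE('2024-01-01 00:00:00');
{code}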
[jira] [Assigned] (PHOENIX-7409) Tests in CDCQueryIT are flapping when using a different number of salt buckets for CDC and data table
[ https://issues.apache.org/jira/browse/PHOENIX-7409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Krishna Dara reassigned PHOENIX-7409: -- Assignee: Hari Krishna Dara > Tests in CDCQueryIT are flapping when using a different number of salt > buckets for CDC and data table > - > > Key: PHOENIX-7409 > URL: https://issues.apache.org/jira/browse/PHOENIX-7409 > Project: Phoenix > Issue Type: Sub-task > Reporter: Hari Krishna Dara >Assignee: Hari Krishna Dara >Priority: Major > > When the data table has salt buckets and you create a CDC with a different > number of salt buckets, tests in CDCQueryIT are flapping. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PHOENIX-7343) Support for complex types in CDC
[ https://issues.apache.org/jira/browse/PHOENIX-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Krishna Dara reassigned PHOENIX-7343: -- Assignee: Hari Krishna Dara > Support for complex types in CDC > > > Key: PHOENIX-7343 > URL: https://issues.apache.org/jira/browse/PHOENIX-7343 > Project: Phoenix > Issue Type: Sub-task > Reporter: Hari Krishna Dara > Assignee: Hari Krishna Dara >Priority: Major > > Support for the two complex types, viz., ARRAY and JSON, needs to be added for > CDC. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PHOENIX-7384) Null not handled in prepareDataTableScan of CDCGlobalIndexRegionScanner
[ https://issues.apache.org/jira/browse/PHOENIX-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Krishna Dara reassigned PHOENIX-7384: -- Assignee: Hari Krishna Dara > Null not handled in prepareDataTableScan of CDCGlobalIndexRegionScanner > --- > > Key: PHOENIX-7384 > URL: https://issues.apache.org/jira/browse/PHOENIX-7384 > Project: Phoenix > Issue Type: Bug >Reporter: Saurabh Rai > Assignee: Hari Krishna Dara >Priority: Major > > Null not handled in prepareDataTableScan of CDCGlobalIndexRegionScanner > {quote}Caused by: java.lang.NullPointerException at > org.apache.phoenix.util.CDCUtil.setupScanForCDC(CDCUtil.java:98) at > org.apache.phoenix.coprocessor.CDCGlobalIndexRegionScanner.prepareDataTableScan(CDCGlobalIndexRegionScanner.java:99) > at > org.apache.phoenix.coprocessor.UncoveredGlobalIndexRegionScanner.scanDataRows(UncoveredGlobalIndexRegionScanner.java:134) > at > org.apache.phoenix.coprocessor.UncoveredGlobalIndexRegionScanner$1.call(UncoveredGlobalIndexRegionScanner.java:177) > at > org.apache.phoenix.coprocessor.UncoveredGlobalIndexRegionScanner$1.call(UncoveredGlobalIndexRegionScanner.java:166) > at > org.apache.phoenix.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131) > at > org.apache.phoenix.thirdparty.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:74) > at > org.apache.phoenix.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:750){quote} > Null is being returned from this method - > https://github.com/apache/phoenix/blob/f1b0102301c06390c51716bebffc6ebd2eda7b19/phoenix-core-server/src/main/java/org/apache/phoenix/coprocessor/UncoveredIndexRegionScanner.java#L215 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PHOENIX-7350) Update documentation
[ https://issues.apache.org/jira/browse/PHOENIX-7350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Krishna Dara updated PHOENIX-7350: --- Attachment: (was: cdc-docs.patch) > Update documentation > > > Key: PHOENIX-7350 > URL: https://issues.apache.org/jira/browse/PHOENIX-7350 > Project: Phoenix > Issue Type: Sub-task > Reporter: Hari Krishna Dara > Assignee: Hari Krishna Dara >Priority: Major > Attachments: cdc-docs.patch > > > Update the site pages for documentation on CDC. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PHOENIX-7350) Update documentation
[ https://issues.apache.org/jira/browse/PHOENIX-7350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Krishna Dara updated PHOENIX-7350: --- Attachment: cdc-docs.patch > Update documentation > > > Key: PHOENIX-7350 > URL: https://issues.apache.org/jira/browse/PHOENIX-7350 > Project: Phoenix > Issue Type: Sub-task > Reporter: Hari Krishna Dara > Assignee: Hari Krishna Dara >Priority: Major > Attachments: cdc-docs.patch > > > Update the site pages for documentation on CDC. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PHOENIX-7350) Update documentation
[ https://issues.apache.org/jira/browse/PHOENIX-7350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Krishna Dara reassigned PHOENIX-7350: -- Assignee: Hari Krishna Dara > Update documentation > > > Key: PHOENIX-7350 > URL: https://issues.apache.org/jira/browse/PHOENIX-7350 > Project: Phoenix > Issue Type: Sub-task > Reporter: Hari Krishna Dara > Assignee: Hari Krishna Dara >Priority: Major > > Update the site pages for documentation on CDC. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PHOENIX-7350) Update documentation
Hari Krishna Dara created PHOENIX-7350: -- Summary: Update documentation Key: PHOENIX-7350 URL: https://issues.apache.org/jira/browse/PHOENIX-7350 Project: Phoenix Issue Type: Sub-task Reporter: Hari Krishna Dara Update the site pages for documentation on CDC. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PHOENIX-7349) Improve the error messaging when CDC index is not yet active
Hari Krishna Dara created PHOENIX-7349: -- Summary: Improve the error messaging when CDC index is not yet active Key: PHOENIX-7349 URL: https://issues.apache.org/jira/browse/PHOENIX-7349 Project: Phoenix Issue Type: Sub-task Reporter: Hari Krishna Dara When you query CDC while its index is still in the building state, you get the below cryptic error: {quote}Error: ERROR 2014 (INT16): Row Value Constructor Offset Not Coercible to a Primary or Indexed RowKey. No table or index could be coerced to the PK as the offset. Or an uncovered index was attempted (state=INT16,code=2014) {quote} This situation doesn't happen for regular queries because such indexes get silently dropped from the query plan. We need to ensure a more meaningful message in this case. -- This message was sent by Atlassian Jira (v8.20.10#820010)
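A sketch of the sequence that can surface this error (object names are illustrative):

{code:sql}
-- Create a table and a CDC object on it; the backing CDC index starts
-- out in the building state.
CREATE TABLE t (k INTEGER PRIMARY KEY, v1 INTEGER);
CREATE CDC t_cdc ON t;

-- Querying the CDC object before its index becomes active yields the
-- cryptic ERROR 2014 (INT16) quoted above instead of a clear message
-- about the index state.
SELECT * FROM t_cdc;
{code}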
[jira] [Updated] (PHOENIX-7348) Default INCLUDE scopes given in CREATE CDC are not getting recognized
[ https://issues.apache.org/jira/browse/PHOENIX-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Krishna Dara updated PHOENIX-7348: --- Summary: Default INCLUDE scopes given in CREATE CDC are not getting recognized (was: Default INCLUDE scopes gives in CREATE CDC are not getting recognized) > Default INCLUDE scopes given in CREATE CDC are not getting recognized > - > > Key: PHOENIX-7348 > URL: https://issues.apache.org/jira/browse/PHOENIX-7348 > Project: Phoenix > Issue Type: Bug > Reporter: Hari Krishna Dara > Assignee: Hari Krishna Dara >Priority: Minor > > The CREATE CDC statement allows specifying a default for the change image > scopes which should get used when there is no query hint, but this value is > not getting used. There is also no test to catch this issue. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PHOENIX-7348) Default INCLUDE scopes gives in CREATE CDC are not getting recognized
Hari Krishna Dara created PHOENIX-7348: -- Summary: Default INCLUDE scopes gives in CREATE CDC are not getting recognized Key: PHOENIX-7348 URL: https://issues.apache.org/jira/browse/PHOENIX-7348 Project: Phoenix Issue Type: Bug Reporter: Hari Krishna Dara Assignee: Hari Krishna Dara The CREATE CDC statement allows specifying a default for the change image scopes which should get used when there is no query hint, but this value is not getting used. There is also no test to catch this issue. -- This message was sent by Atlassian Jira (v8.20.10#820010)
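To illustrate the intended behavior (names are illustrative, and the scope-list spelling is an assumption based on the INCLUDE grammar sketched in PHOENIX-7001):

{code:sql}
-- Declare default change-image scopes on the CDC object itself.
CREATE TABLE t (k INTEGER PRIMARY KEY, v1 INTEGER);
CREATE CDC t_cdc ON t INCLUDE (pre, post);

-- No scope hint here, so the DDL-time default INCLUDE (pre, post) should
-- decide which images appear in the change JSON; per this bug it was
-- being ignored.
SELECT * FROM t_cdc;
{code}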
[jira] [Resolved] (PHOENIX-7015) Extend UncoveredGlobalIndexRegionScanner for CDC region scanner usecase
[ https://issues.apache.org/jira/browse/PHOENIX-7015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Krishna Dara resolved PHOENIX-7015. Resolution: Fixed > Extend UncoveredGlobalIndexRegionScanner for CDC region scanner usecase > --- > > Key: PHOENIX-7015 > URL: https://issues.apache.org/jira/browse/PHOENIX-7015 > Project: Phoenix > Issue Type: Sub-task >Reporter: Viraj Jasani >Priority: Major > > For the CDC region scanner use case, extend UncoveredGlobalIndexRegionScanner to > CDCUncoveredGlobalIndexRegionScanner. The new region scanner for CDC performs a > raw scan on the index table and retrieves data table rows from the index rows. > Using the time range, it can form a JSON blob to represent changes to the row > including pre and/or post row images. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PHOENIX-7345) Support for alternative indexing scheme for CDC
Hari Krishna Dara created PHOENIX-7345: -- Summary: Support for alternative indexing scheme for CDC Key: PHOENIX-7345 URL: https://issues.apache.org/jira/browse/PHOENIX-7345 Project: Phoenix Issue Type: Sub-task Reporter: Hari Krishna Dara When a CDC table is created, an index is created on PHOENIX_ROW_TIMESTAMP(), which makes it possible to run range scans efficiently on the change timestamp. Since indexes always include the PK columns of the data table, additional filtering on the data table PK columns can also be done efficiently. However, a use case may require filtering based on a specific order of columns that includes both data and PK columns, so having support for customizing the PK for the CDC index will be beneficial. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PHOENIX-7344) Support for Dynamic Columns
Hari Krishna Dara created PHOENIX-7344: -- Summary: Support for Dynamic Columns Key: PHOENIX-7344 URL: https://issues.apache.org/jira/browse/PHOENIX-7344 Project: Phoenix Issue Type: Sub-task Reporter: Hari Krishna Dara CDC recognizes changes for only those columns with static metadata, which means Dynamic Columns are completely ignored. We need to extend the functionality such that SELECT queries on CDC objects also support Dynamic Columns. -- This message was sent by Atlassian Jira (v8.20.10#820010)
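For reference, Phoenix declares dynamic columns inline per statement, so they carry no static metadata for CDC to consult; a small example of the syntax in question (table and column names are illustrative):

{code:sql}
CREATE TABLE event_log (k INTEGER PRIMARY KEY, v1 INTEGER);

-- "host" is a dynamic column, declared only in this UPSERT.
UPSERT INTO event_log (k, v1, host VARCHAR) VALUES (1, 100, 'node-1');

-- Reading a dynamic column back requires declaring it in the query too.
SELECT k, v1, host FROM event_log (host VARCHAR);
{code}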
[jira] [Created] (PHOENIX-7343) Support for complex types in CDC
Hari Krishna Dara created PHOENIX-7343: -- Summary: Support for complex types in CDC Key: PHOENIX-7343 URL: https://issues.apache.org/jira/browse/PHOENIX-7343 Project: Phoenix Issue Type: Sub-task Reporter: Hari Krishna Dara Support for the two complex types, viz., ARRAY and JSON, needs to be added for CDC. -- This message was sent by Atlassian Jira (v8.20.10#820010)
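As a reminder of the syntax involved, a sketch of the ARRAY side, which is standard Phoenix DDL; the JSON literal syntax is omitted here since it depends on the Phoenix version:

{code:sql}
-- ARRAY is one of the two complex types whose changes CDC should surface.
CREATE TABLE ct (k INTEGER PRIMARY KEY, vals INTEGER ARRAY);

UPSERT INTO ct (k, vals) VALUES (1, ARRAY[1, 2, 3]);

-- Phoenix array indexing is 1-based.
SELECT vals[1] FROM ct WHERE k = 1;
{code}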
[jira] [Updated] (PHOENIX-7342) Optimize data table scan range based on the startRow/endRow from Scan
[ https://issues.apache.org/jira/browse/PHOENIX-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Krishna Dara updated PHOENIX-7342: --- Description: When a time range is specified in a SELECT query on CDC, it is possible to optimize the scan on the data table by setting the time range. (was: Currently CDC can be created to use an UNCOVERED global index, but it should be possible to make use of a LOCAL index as well. ) Summary: Optimize data table scan range based on the startRow/endRow from Scan (was: Support for using a local index type) > Optimize data table scan range based on the startRow/endRow from Scan > - > > Key: PHOENIX-7342 > URL: https://issues.apache.org/jira/browse/PHOENIX-7342 > Project: Phoenix > Issue Type: Sub-task > Reporter: Hari Krishna Dara >Priority: Minor > > When a time range is specified in a SELECT query on CDC, it is possible to > optimize the scan on the data table by setting the time range. -- This message was sent by Atlassian Jira (v8.20.10#820010)
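The kind of query this optimization targets, sketched with an illustrative CDC object name and date literals:

{code:sql}
-- The PHOENIX_ROW_TIMESTAMP() bounds below already restrict the scan on
-- the CDC index; this issue is about pushing the same bounds down as a
-- time range on the raw scan of the data table as well.
SELECT * FROM t_cdc
WHERE PHOENIX_ROW_TIMESTAMP() >= TO_DATE('2024-06-01 00:00:00')
  AND PHOENIX_ROW_TIMESTAMP() <  TO_DATE('2024-06-01 01:00:00');
{code}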
[jira] [Created] (PHOENIX-7342) Support for using a local index type
Hari Krishna Dara created PHOENIX-7342: -- Summary: Support for using a local index type Key: PHOENIX-7342 URL: https://issues.apache.org/jira/browse/PHOENIX-7342 Project: Phoenix Issue Type: Sub-task Reporter: Hari Krishna Dara Currently CDC can be created to use an UNCOVERED global index, but it should be possible to make use of a LOCAL index as well. -- This message was sent by Atlassian Jira (v8.20.10#820010)
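For comparison, the two index flavors in plain DDL; today CDC rides on the first, and this issue asks to also allow the second (index and table names are illustrative):

{code:sql}
-- Uncovered global index on the change timestamp (what CDC uses today).
CREATE UNCOVERED INDEX t_ts_gidx ON t (PHOENIX_ROW_TIMESTAMP());

-- Local index on the same expression (the proposed alternative); local
-- index data is stored in the data table's own regions rather than in a
-- separate physical index table.
CREATE LOCAL INDEX t_ts_lidx ON t (PHOENIX_ROW_TIMESTAMP());
{code}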
[jira] [Resolved] (PHOENIX-7001) Change Data Capture leveraging Max Lookback and Uncovered Indexes
[ https://issues.apache.org/jira/browse/PHOENIX-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Krishna Dara resolved PHOENIX-7001. Release Note: Change Data Capture (CDC) is a feature designed to capture changes to tables or updatable views in near real-time. This new functionality supports various use cases, including: * Real-Time Change Retrieval: Capture and retrieve changes as they happen or with minimal delay. * Flexible Time Range Queries: Perform queries based on specific time ranges, typically short periods such as the last few minutes, hours, or the last few days. * Comprehensive Change Tracking: Track all types of changes including insertions, updates, and deletions. Note that CDC does not differentiate between inserts and updates due to Phoenix’s handling of new versus existing rows. Key features of the CDC include: * Ordered Change Delivery: Changes are delivered in the order they arrive, ensuring the sequence of events is maintained. * Streamlined Integration: Changes can be visualized and delivered to applications similarly to how Phoenix query results are retrieved, but with enhancements to support multiple results for each row and inclusion of deleted rows. * Detailed Change Information: Optionally capture pre and post-change images of rows to provide a complete picture of modifications. This enhancement empowers applications to maintain an accurate and timely reflection of database changes, supporting a wide array of real-time data processing and monitoring scenarios. Resolution: Fixed > Change Data Capture leveraging Max Lookback and Uncovered Indexes > - > > Key: PHOENIX-7001 > URL: https://issues.apache.org/jira/browse/PHOENIX-7001 > Project: Phoenix > Issue Type: Improvement >Reporter: Kadir Ozdemir >Priority: Major > > The use cases for a Change Data Capture (CDC) feature are centered around > capturing changes to a given table (or updatable view) as these changes > happen in near real-time. A CDC application can retrieve changes in real-time > or with some delay, or even retrieve the same set of changes multiple times. > This means the CDC use case can be generalized as time range queries where > the time range is typically short such as last x minutes or hours or > expressed as a specific time range in the last n days where n is typically > less than 7. > A change is an update in a row. That is, a change is either updating one or > more columns of a table for a given row or deleting a row. It is desirable to > provide these changes in the order of their arrival. One can visualize the > delivery of these changes through a stream from a Phoenix table to the > application that is initiated by the application similar to the delivery of > any other Phoenix query results. The difference is that a regular query > result includes at most one result row for each row satisfying the query and > the deleted rows are not visible to the query result while the CDC > stream/result can include multiple result rows for each row and the result > includes deleted rows. Some use cases need to also get the pre and/or post > image of the row along with a change on the row. > The design proposed here leverages Phoenix Max Lookback and Uncovered Global > Indexes. The max lookback feature retains recent changes to a table, that is, > the changes that have been done in the last x days typically. This means that > the max lookback feature already captures the changes to a given table. > Currently, the max lookback age is configurable at the cluster level. 
We need > to extend this capability to be able to configure the max lookback age at the > table level so that each table can have a different max lookback age based on > its CDC application requirements. > To deliver the changes in the order of their arrival, we need a time based > index. This index should be uncovered as the changes are already retained in > the table by the max lookback feature. The arrival time will be defined as > the mutation timestamp generated by the server. An uncovered index would > allow us efficient and orderly access to the changes. Changes to an > index table are also preserved by the max lookback feature. > A CDC feature can be composed of the following components: > * {*}CDCUncoveredIndexRegionScanner{*}: This is a server side scanner on an > uncovered index used for CDC. This can inherit UncoveredIndexRegionScanner. > It goes through index table rows using a raw scan to identify data table rows > and retrieves these rows using a raw scan. Using the time range, it forms a > JSON blob to represent changes to the row including pre and/or post row images.
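Pulling the release note together, a minimal end-to-end sketch; the DDL shape, the scope-list spelling, and the CDC_INCLUDE hint name follow the descriptions in this issue and should be treated as illustrative rather than authoritative:

{code:sql}
CREATE TABLE t (k INTEGER PRIMARY KEY, v1 INTEGER);
CREATE CDC t_cdc ON t INCLUDE (latest);

UPSERT INTO t (k, v1) VALUES (1, 100);  -- insert
UPSERT INTO t (k, v1) VALUES (1, 101);  -- update (CDC does not distinguish)
DELETE FROM t WHERE k = 1;              -- deletions are also captured

-- Changes come back in arrival order; the hint overrides the DDL-time
-- INCLUDE default and asks for pre and post images of each changed row.
SELECT /*+ CDC_INCLUDE(PRE, POST) */ * FROM t_cdc
WHERE PHOENIX_ROW_TIMESTAMP() >= TO_DATE('2024-06-01 00:00:00');
{code}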
[jira] [Created] (PHOENIX-7239) When an uncovered index has different number of salt buckets than the data table, query returns no data
Hari Krishna Dara created PHOENIX-7239: -- Summary: When an uncovered index has different number of salt buckets than the data table, query returns no data Key: PHOENIX-7239 URL: https://issues.apache.org/jira/browse/PHOENIX-7239 Project: Phoenix Issue Type: Bug Environment: When you use a salt bucketing value for the index that is different from that of the data table, you get no results. As can be seen from the below examples, when using the index with 4 buckets (same as the buckets in the data table), there were results, but when it was 1 or 2, there were none. {{0: jdbc:phoenix:localhost> create table tsalt (k INTEGER PRIMARY KEY, v1 INTEGER) SALT_BUCKETS=4;}} {{0: jdbc:phoenix:localhost> upsert into tsalt (k, v1) VALUES (1, 100);}} {{0: jdbc:phoenix:localhost> create uncovered index tsaltidx on tsalt (PHOENIX_ROW_TIMESTAMP());}} {{select /*+ INDEX(TSALT TSALTIDX) */ * from TSALT;}} {{+---+-----+}} {{| K | V1 |}} {{+---+-----+}} {{+---+-----+}} {{No rows selected (0.059 seconds)}} {{0: jdbc:phoenix:localhost> create uncovered index tsaltidx4 on tsalt (PHOENIX_ROW_TIMESTAMP());}} {{1 row affected (6.175 seconds)}} {{0: jdbc:phoenix:localhost> select /*+ INDEX(TSALT TSALTIDX4) */ * from TSALT;}} {{+---+-----+}} {{| K | V1 |}} {{+---+-----+}} {{| 1 | 100 |}} {{+---+-----+}} {{1 row selected (0.035 seconds)}} {{0: jdbc:phoenix:localhost> create uncovered index tsaltidx2 on tsalt (PHOENIX_ROW_TIMESTAMP()) salt_buckets=2;}} {{0: jdbc:phoenix:localhost> select /*+ INDEX(TSALT TSALTIDX2) */ * from TSALT;}} {{+---+-----+}} {{| K | V1 |}} {{+---+-----+}} {{+---+-----+}} {{No rows selected (0.059 seconds)}} Reporter: Hari Krishna Dara -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PHOENIX-7238) Queries that use an uncovered index with SALT_BUCKETS=0 fail with a /0 error
[ https://issues.apache.org/jira/browse/PHOENIX-7238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Krishna Dara updated PHOENIX-7238: --- Summary: Queries that use an uncovered index with SALT_BUCKETS=0 fail with a /0 error (was: Zero is accepted for SALT_BUCKETS, but queries fail) > Queries that use an uncovered index with SALT_BUCKETS=0 fail with a /0 error > > > Key: PHOENIX-7238 > URL: https://issues.apache.org/jira/browse/PHOENIX-7238 > Project: Phoenix > Issue Type: Bug > Reporter: Hari Krishna Dara >Priority: Minor > > I have not done extensive testing on it, but when I specified > {{SALT_BUCKETS=0}} while creating an index, I get no error and this is a > valid use case to disable salting on an index when the data table is salted: > {{create table tsalt (k INTEGER PRIMARY KEY, v1 INTEGER) SALT_BUCKETS=4;}} > {{upsert into tsalt (k, v1) VALUES (1, 100);}} > {{create uncovered index tsaltidx on tsalt (PHOENIX_ROW_TIMESTAMP()) > SALT_BUCKETS=0;}} > > From schema and hbase regions, it is correctly treated as a no-salting scenario. > > {{0: jdbc:phoenix:localhost> select salt_buckets from system.catalog where > table_name = 'TSALTIDX' and salt_buckets is not null;}} > +--------------+ > | SALT_BUCKETS | > +--------------+ > +--------------+ > No rows selected (0.026 seconds) > > {{hbase:001:0> list_regions 'TSALTIDX'}} > {{ SERVER_NAME | REGION_NAME | START_KEY | END_KEY | SIZE | REQ | LOCALITY |}} > {{ - | - | -- | -- | - | - | -- |}} > {{ localhost,16020,1708958003582 | TSALTIDX,,1708958225506.a72b20c15cecba23289a03cd6956ec15. | | | 0 | 3 | 0.0 |}} > {{ 1 rows}} > However, when I query through the index, I get an {{ArithmeticException}} for > divide by zero. > {{0: jdbc:phoenix:localhost> select /*+ INDEX(TSALT TSALTIDX) */ * from > TSALT;}} {{Caused by: java.lang.ArithmeticException: / by zero}} > {{ at > org.apache.phoenix.schema.SaltingUtil.getSaltingByte(SaltingUtil.java:79)}} > {{ at > org.apache.phoenix.index.IndexMaintainer.buildDataRowKey(IndexMaintainer.java:916)}} > {{ at > org.apache.phoenix.coprocessor.UncoveredIndexRegionScanner.scanIndexTableRows(UncoveredIndexRegionScanner.java:253)}} > {{ at > org.apache.phoenix.coprocessor.UncoveredIndexRegionScanner.scanIndexTableRows(UncoveredIndexRegionScanner.java:274)}} > {{ at > org.apache.phoenix.coprocessor.UncoveredIndexRegionScanner.next(UncoveredIndexRegionScanner.java:382)}} > {{ at > org.apache.phoenix.coprocessor.BaseRegionScanner.nextRaw(BaseRegionScanner.java:56)}} > {{ at > org.apache.phoenix.iterate.RegionScannerFactory$1.nextRaw(RegionScannerFactory.java:257)}} > My suspicion is that the table cells have the number of buckets stored as zero, so > PTableImpl for the index gets constructed to return 0 from {{getBucketNum()}} > and this is causing the divide by 0 error. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PHOENIX-7238) Zero is accepted for SALT_BUCKETS, but queries fail
[ https://issues.apache.org/jira/browse/PHOENIX-7238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Krishna Dara updated PHOENIX-7238: --- Description: I have not done extensive testing on it, but when I specified {{SALT_BUCKETS=0}} while creating an index, I get no error and this is a valid use case to disable salting on an index when the data table is salted: {{create table tsalt (k INTEGER PRIMARY KEY, v1 INTEGER) SALT_BUCKETS=4;}} {{upsert into tsalt (k, v1) VALUES (1, 100);}} {{create uncovered index tsaltidx on tsalt (PHOENIX_ROW_TIMESTAMP()) SALT_BUCKETS=0;}} From schema and hbase regions, it is correctly treated as a no-salting scenario. {{0: jdbc:phoenix:localhost> select salt_buckets from system.catalog where table_name = 'TSALTIDX' and salt_buckets is not null;}} +--------------+ | SALT_BUCKETS | +--------------+ +--------------+ No rows selected (0.026 seconds) {{hbase:001:0> list_regions 'TSALTIDX'}} {{ SERVER_NAME | REGION_NAME | START_KEY | END_KEY | SIZE | REQ | LOCALITY |}} {{ - | - | -- | -- | - | - | -- |}} {{ localhost,16020,1708958003582 | TSALTIDX,,1708958225506.a72b20c15cecba23289a03cd6956ec15. | | | 0 | 3 | 0.0 |}} {{ 1 rows}} However, when I query through the index, I get an {{ArithmeticException}} for divide by zero. {{0: jdbc:phoenix:localhost> select /*+ INDEX(TSALT TSALTIDX) */ * from TSALT;}} {{Caused by: java.lang.ArithmeticException: / by zero}} {{ at org.apache.phoenix.schema.SaltingUtil.getSaltingByte(SaltingUtil.java:79)}} {{ at org.apache.phoenix.index.IndexMaintainer.buildDataRowKey(IndexMaintainer.java:916)}} {{ at org.apache.phoenix.coprocessor.UncoveredIndexRegionScanner.scanIndexTableRows(UncoveredIndexRegionScanner.java:253)}} {{ at org.apache.phoenix.coprocessor.UncoveredIndexRegionScanner.scanIndexTableRows(UncoveredIndexRegionScanner.java:274)}} {{ at org.apache.phoenix.coprocessor.UncoveredIndexRegionScanner.next(UncoveredIndexRegionScanner.java:382)}} {{ at org.apache.phoenix.coprocessor.BaseRegionScanner.nextRaw(BaseRegionScanner.java:56)}} {{ at org.apache.phoenix.iterate.RegionScannerFactory$1.nextRaw(RegionScannerFactory.java:257)}} My suspicion is that the table cells have the number of buckets stored as zero, so PTableImpl for the index gets constructed to return 0 from {{getBucketNum()}} and this is causing the divide by 0 error. was: I have not done extensive testing on it, but when I specified {{SALT_BUCKETS=0}} while creating an index, I get no error and this is a valid use case to disable salting on an index when the data table is salted: {{create table tsalt (k INTEGER PRIMARY KEY, v1 INTEGER) SALT_BUCKETS=4;}} {{upsert into tsalt (k, v1) VALUES (1, 100);}} {{create uncovered index tsaltidx on tsalt (PHOENIX_ROW_TIMESTAMP()) SALT_BUCKETS=0;}} From schema and hbase regions, it is correctly treated as a no-salting scenario. {{0: jdbc:phoenix:localhost> select salt_buckets from system.catalog where table_name = 'TSALTIDX' and salt_buckets is not null;}} +--------------+ | SALT_BUCKETS | +--------------+ +--------------+ {{hbase:001:0> list_regions 'TSALTIDX'}} {{ SERVER_NAME | REGION_NAME | START_KEY | END_KEY | SIZE | REQ | LOCALITY |}} {{ - | - | -- | -- | - | - | -- |}} {{ localhost,16020,1708958003582 | TSALTIDX,,1708958225506.a72b20c15cecba23289a03cd6956ec15. | | | 0 | 3 | 0.0 |}} {{ 1 rows}} However, when I query through the index, I get an {{ArithmeticException}} for divide by zero. {{0: jdbc:phoenix:localhost> select /*+ INDEX(TSALT TSALTIDX) */ * from TSALT;}} {{Caused by: java.lang.ArithmeticException: / by zero}} {{ at org.apache.phoenix.schema.SaltingUtil.getSaltingByte(SaltingUtil.java:79)}} {{ at org.apache.phoenix.index.IndexMaintainer.buildDataRowKey(IndexMaintainer.java:916)}} {{ at org.apache.phoenix.coprocessor.UncoveredIndexRegionScanner.scanIndexTableRows(UncoveredIndexRegionScanner.java:253)}} {{ at org.apache.phoenix.coprocessor.UncoveredIndexRegionScanner.scanIndexTableRows(UncoveredIndexRegionScanner.java:274)}} {{ at org.apache.phoenix.coprocessor.UncoveredIndexRegionScanner.next(UncoveredIndexRegionScanner.java:382)}} {{ at org.apache.phoenix.coprocessor.BaseRegionScanner.nextRaw(BaseRegionScanner.java:56)}} {{ at org.apache.phoenix.iterate.RegionScannerFactory$1.nextRaw(RegionScannerFactory.java:257)}}
[jira] [Updated] (PHOENIX-7238) Zero is accepted for SALT_BUCKETS, but queries fail
[ https://issues.apache.org/jira/browse/PHOENIX-7238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Krishna Dara updated PHOENIX-7238: --- Description: I have not done extensive testing on it, but when I specified {{SALT_BUCKETS=0}} while creating an index, I get no error and this is a valid use case to disable salting on an index when the data table is salted: {{create table tsalt (k INTEGER PRIMARY KEY, v1 INTEGER) SALT_BUCKETS=4;}} {{upsert into tsalt (k, v1) VALUES (1, 100);}} {{create uncovered index tsaltidx on tsalt (PHOENIX_ROW_TIMESTAMP()) SALT_BUCKETS=0;}} From schema and hbase regions, it is correctly treated as a no-salting scenario. {{0: jdbc:phoenix:localhost> select salt_buckets from system.catalog where table_name = 'TSALTIDX' and salt_buckets is not null;}} +--------------+ | SALT_BUCKETS | +--------------+ +--------------+ {{hbase:001:0> list_regions 'TSALTIDX'}} {{ SERVER_NAME | REGION_NAME | START_KEY | END_KEY | SIZE | REQ | LOCALITY |}} {{ - | - | -- | -- | - | - | -- |}} {{ localhost,16020,1708958003582 | TSALTIDX,,1708958225506.a72b20c15cecba23289a03cd6956ec15. | | | 0 | 3 | 0.0 |}} {{ 1 rows}} However, when I query through the index, I get an {{ArithmeticException}} for divide by zero. {{0: jdbc:phoenix:localhost> select /*+ INDEX(TSALT TSALTIDX) */ * from TSALT;}} {{Caused by: java.lang.ArithmeticException: / by zero}} {{ at org.apache.phoenix.schema.SaltingUtil.getSaltingByte(SaltingUtil.java:79)}} {{ at org.apache.phoenix.index.IndexMaintainer.buildDataRowKey(IndexMaintainer.java:916)}} {{ at org.apache.phoenix.coprocessor.UncoveredIndexRegionScanner.scanIndexTableRows(UncoveredIndexRegionScanner.java:253)}} {{ at org.apache.phoenix.coprocessor.UncoveredIndexRegionScanner.scanIndexTableRows(UncoveredIndexRegionScanner.java:274)}} {{ at org.apache.phoenix.coprocessor.UncoveredIndexRegionScanner.next(UncoveredIndexRegionScanner.java:382)}} {{ at org.apache.phoenix.coprocessor.BaseRegionScanner.nextRaw(BaseRegionScanner.java:56)}} {{ at org.apache.phoenix.iterate.RegionScannerFactory$1.nextRaw(RegionScannerFactory.java:257)}} My suspicion is that the table cells have the number of buckets stored as zero, so PTableImpl for the index gets constructed to return 0 from {{getBucketNum()}} and this is causing the divide by 0 error. was: I have not done extensive testing on it, but when I specified {{SALT_BUCKETS=0}} while creating an index, I get no error and this is a valid use case to disable salting on an index when the data table is salted: {{create table tsalt (k INTEGER PRIMARY KEY, v1 INTEGER) SALT_BUCKETS=4;}} {{upsert into tsalt (k, v1) VALUES (1, 100);}} {{create uncovered index tsaltidx on tsalt (PHOENIX_ROW_TIMESTAMP()) SALT_BUCKETS=0;}} From schema and hbase regions, it is correctly treated as a no-salting scenario. {{0: jdbc:phoenix:localhost> select salt_buckets from system.catalog where table_name = 'TSALTIDX' and salt_buckets is not null;}} +--------------+ | SALT_BUCKETS | +--------------+ +--------------+ {{hbase:001:0> list_regions 'TSALTIDX' SERVER_NAME | REGION_NAME | START_KEY | END_KEY | SIZE | REQ | LOCALITY | - | - | -- | -- | - | - | -- | localhost,16020,1708958003582 | TSALTIDX,,1708958225506.a72b20c15cecba23289a03cd6956ec15. | | | 0 | 3 | 0.0 | 1 rows}} However, when I query through the index, I get an {{ArithmeticException}} for divide by zero. {{0: jdbc:phoenix:localhost> select /*+ INDEX(TSALT TSALTIDX) */ * from TSALT;}} {{Caused by: java.lang.ArithmeticException: / by zero at org.apache.phoenix.schema.SaltingUtil.getSaltingByte(SaltingUtil.java:79) at org.apache.phoenix.index.IndexMaintainer.buildDataRowKey(IndexMaintainer.java:916) at org.apache.phoenix.coprocessor.UncoveredIndexRegionScanner.scanIndexTableRows(UncoveredIndexRegionScanner.java:253) at org.apache.phoenix.coprocessor.UncoveredIndexRegionScanner.scanIndexTableRows(UncoveredIndexRegionScanner.java:274) at org.apache.phoenix.coprocessor.UncoveredIndexRegionScanner.next(UncoveredIndexRegionScanner.java:382) at org.apache.phoenix.coprocessor.BaseRegionScanner.nextRaw(BaseRegionScanner.java:56) at org.apache.phoenix.iterate.RegionScannerFactory$1.nextRaw(RegionScannerFactory.java:257)}} My suspicion is that the table cells have the number of buckets stored as zero, so PTableImpl for the index gets constructed to return 0 from {{getBucketNum()}} and this is causing the divide by 0 error.
[jira] [Updated] (PHOENIX-7238) Zero is accepted for SALT_BUCKETS, but queries fail
[ https://issues.apache.org/jira/browse/PHOENIX-7238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Krishna Dara updated PHOENIX-7238: --- Description: I have not done extensive testing on it, but when I specified {{SALT_BUCKETS=0}} while creating an index, I get no error and this is a valid use case to disable salting on an index when the data table is salted: {{create table tsalt (k INTEGER PRIMARY KEY, v1 INTEGER) SALT_BUCKETS=4;}} {{upsert into tsalt (k, v1) VALUES (1, 100);}} {{create uncovered index tsaltidx on tsalt (PHOENIX_ROW_TIMESTAMP()) SALT_BUCKETS=0;}} From schema and hbase regions, it is correctly treated as a no-salting scenario. {{0: jdbc:phoenix:localhost> select salt_buckets from system.catalog where table_name = 'TSALTIDX' and salt_buckets is not null;}} +--------------+ | SALT_BUCKETS | +--------------+ +--------------+ {{hbase:001:0> list_regions 'TSALTIDX' SERVER_NAME | REGION_NAME | START_KEY | END_KEY | SIZE | REQ | LOCALITY | - | - | -- | -- | - | - | -- | localhost,16020,1708958003582 | TSALTIDX,,1708958225506.a72b20c15cecba23289a03cd6956ec15. | | | 0 | 3 | 0.0 | 1 rows}} However, when I query through the index, I get an {{ArithmeticException}} for divide by zero. {{0: jdbc:phoenix:localhost> select /*+ INDEX(TSALT TSALTIDX) */ * from TSALT;}} {{Caused by: java.lang.ArithmeticException: / by zero at org.apache.phoenix.schema.SaltingUtil.getSaltingByte(SaltingUtil.java:79) at org.apache.phoenix.index.IndexMaintainer.buildDataRowKey(IndexMaintainer.java:916) at org.apache.phoenix.coprocessor.UncoveredIndexRegionScanner.scanIndexTableRows(UncoveredIndexRegionScanner.java:253) at org.apache.phoenix.coprocessor.UncoveredIndexRegionScanner.scanIndexTableRows(UncoveredIndexRegionScanner.java:274) at org.apache.phoenix.coprocessor.UncoveredIndexRegionScanner.next(UncoveredIndexRegionScanner.java:382) at org.apache.phoenix.coprocessor.BaseRegionScanner.nextRaw(BaseRegionScanner.java:56) at org.apache.phoenix.iterate.RegionScannerFactory$1.nextRaw(RegionScannerFactory.java:257)}} My suspicion is that the table cells have the number of buckets stored as zero, so PTableImpl for the index gets constructed to return 0 from {{getBucketNum()}} and this is causing the divide by 0 error. was: I have not done extensive testing on it, but when I specified {{SALT_BUCKETS=0}} while creating an index, I get no error and this is a valid use case to disable salting on an index when the data table is salted: {{create table tsalt (k INTEGER PRIMARY KEY, v1 INTEGER) SALT_BUCKETS=4;}} {{upsert into tsalt (k, v1) VALUES (1, 100);}} {{create uncovered index tsaltidx on tsalt (PHOENIX_ROW_TIMESTAMP()) SALT_BUCKETS=0;}} From schema and hbase regions, it appears to be treated like no salting. {{0: jdbc:phoenix:localhost> select salt_buckets from system.catalog where table_name = 'TSALTIDX' and salt_buckets is not null;}} +--------------+ | SALT_BUCKETS | +--------------+ +--------------+ {{hbase:001:0> list_regions 'TSALTIDX' SERVER_NAME | REGION_NAME | START_KEY | END_KEY | SIZE | REQ | LOCALITY | - | - | -- | -- | - | - | -- | localhost,16020,1708958003582 | TSALTIDX,,1708958225506.a72b20c15cecba23289a03cd6956ec15. | | | 0 | 3 | 0.0 | 1 rows}} However, when I query through the index, I get an {{ArithmeticException}} for divide by zero. {{0: jdbc:phoenix:localhost> select /*+ INDEX(TSALT TSALTIDX) */ * from TSALT;}} {{Caused by: java.lang.ArithmeticException: / by zero at org.apache.phoenix.schema.SaltingUtil.getSaltingByte(SaltingUtil.java:79) at org.apache.phoenix.index.IndexMaintainer.buildDataRowKey(IndexMaintainer.java:916) at org.apache.phoenix.coprocessor.UncoveredIndexRegionScanner.scanIndexTableRows(UncoveredIndexRegionScanner.java:253) at org.apache.phoenix.coprocessor.UncoveredIndexRegionScanner.scanIndexTableRows(UncoveredIndexRegionScanner.java:274) at org.apache.phoenix.coprocessor.UncoveredIndexRegionScanner.next(UncoveredIndexRegionScanner.java:382) at org.apache.phoenix.coprocessor.BaseRegionScanner.nextRaw(BaseRegionScanner.java:56) at org.apache.phoenix.iterate.RegionScannerFactory$1.nextRaw(RegionScannerFactory.java:257)}} My suspicion is that the table cells have the number of buckets stored as zero, so PTableImpl for the index gets constructed to return 0 from {{getBucketNum()}} and this is causing the divide by 0 error.
[jira] [Created] (PHOENIX-7238) Zero is accepted for SALT_BUCKETS, but queries fail
Hari Krishna Dara created PHOENIX-7238: -- Summary: Zero is accepted for SALT_BUCKETS, but queries fail Key: PHOENIX-7238 URL: https://issues.apache.org/jira/browse/PHOENIX-7238 Project: Phoenix Issue Type: Bug Reporter: Hari Krishna Dara I have not done extensive testing on it, but when I specified {{SALT_BUCKETS=0}} while creating an index, I get no error and this is a valid use case to disable salting on an index when the data table is salted: {{create table tsalt (k INTEGER PRIMARY KEY, v1 INTEGER) SALT_BUCKETS=4;}} {{upsert into tsalt (k, v1) VALUES (1, 100);}} {{create uncovered index tsaltidx on tsalt (PHOENIX_ROW_TIMESTAMP()) SALT_BUCKETS=0;}} From schema and hbase regions, it appears to be treated like no salting. {{0: jdbc:phoenix:localhost> select salt_buckets from system.catalog where table_name = 'TSALTIDX' and salt_buckets is not null;}} +--------------+ | SALT_BUCKETS | +--------------+ +--------------+ {{hbase:001:0> list_regions 'TSALTIDX' SERVER_NAME | REGION_NAME | START_KEY | END_KEY | SIZE | REQ | LOCALITY | - | - | -- | -- | - | - | -- | localhost,16020,1708958003582 | TSALTIDX,,1708958225506.a72b20c15cecba23289a03cd6956ec15. | | | 0 | 3 | 0.0 | 1 rows}} However, when I query through the index, I get an {{ArithmeticException}} for divide by zero. {{0: jdbc:phoenix:localhost> select /*+ INDEX(TSALT TSALTIDX) */ * from TSALT;}} {{Caused by: java.lang.ArithmeticException: / by zero at org.apache.phoenix.schema.SaltingUtil.getSaltingByte(SaltingUtil.java:79) at org.apache.phoenix.index.IndexMaintainer.buildDataRowKey(IndexMaintainer.java:916) at org.apache.phoenix.coprocessor.UncoveredIndexRegionScanner.scanIndexTableRows(UncoveredIndexRegionScanner.java:253) at org.apache.phoenix.coprocessor.UncoveredIndexRegionScanner.scanIndexTableRows(UncoveredIndexRegionScanner.java:274) at org.apache.phoenix.coprocessor.UncoveredIndexRegionScanner.next(UncoveredIndexRegionScanner.java:382) at org.apache.phoenix.coprocessor.BaseRegionScanner.nextRaw(BaseRegionScanner.java:56) at org.apache.phoenix.iterate.RegionScannerFactory$1.nextRaw(RegionScannerFactory.java:257)}} My suspicion is that the table cells have the number of buckets stored as zero, so PTableImpl for the index gets constructed to return 0 from {{getBucketNum()}} and this is causing the divide by 0 error. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (PHOENIX-7013) CDC DQL Select query parser
[ https://issues.apache.org/jira/browse/PHOENIX-7013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Krishna Dara resolved PHOENIX-7013. Resolution: Fixed PR: [https://github.com/apache/phoenix/pull/1766] Change has been merged into the feature branch. > CDC DQL Select query parser > --- > > Key: PHOENIX-7013 > URL: https://issues.apache.org/jira/browse/PHOENIX-7013 > Project: Phoenix > Issue Type: Sub-task >Reporter: Viraj Jasani > Assignee: Hari Krishna Dara >Priority: Major > > The purpose of this sub-task is to provide DQL query capability for CDC > (Change Data Capture) feature. > The SELECT query parser can identify the given CDC table based on the table > type defined in SYSTEM.CATALOG and it should be able to parse qualifiers (pre > | post | latest | all) from the query. > CDC DQL query sample: > > {code:java} > Select * from <CDC table> where PHOENIX_ROW_TIMESTAMP() >= TO_DATE( …) > AND PHOENIX_ROW_TIMESTAMP() < TO_DATE( …) > {code} > This query would return the rows of the CDC table. The above select query can > be hinted at by using a new CDC hint to return just the actual change, pre, > post, or latest image of the row, or a combination of them. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PHOENIX-7013) CDC DQL Select query parser
[ https://issues.apache.org/jira/browse/PHOENIX-7013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Krishna Dara reassigned PHOENIX-7013: -- Assignee: Hari Krishna Dara > CDC DQL Select query parser > --- > > Key: PHOENIX-7013 > URL: https://issues.apache.org/jira/browse/PHOENIX-7013 > Project: Phoenix > Issue Type: Sub-task >Reporter: Viraj Jasani > Assignee: Hari Krishna Dara >Priority: Major > > The purpose of this sub-task is to provide DQL query capability for CDC > (Change Data Capture) feature. > The SELECT query parser can identify the given CDC table based on the table > type defined in SYSTEM.CATALOG and it should be able to parse qualifiers (pre > | post | latest | all) from the query. > CDC DQL query sample: > > {code:java} > Select * from <CDC table> where PHOENIX_ROW_TIMESTAMP() >= TO_DATE( …) > AND PHOENIX_ROW_TIMESTAMP() < TO_DATE( …) > {code} > This query would return the rows of the CDC table. The above select query can > be hinted at by using a new CDC hint to return just the actual change, pre, > post, or latest image of the row, or a combination of them. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (PHOENIX-7014) CDC query compiler and optimizer
[ https://issues.apache.org/jira/browse/PHOENIX-7014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Krishna Dara resolved PHOENIX-7014. Resolution: Fixed PR: [https://github.com/apache/phoenix/pull/1766] Merged into the feature branch. > CDC query compiler and optimizer > > > Key: PHOENIX-7014 > URL: https://issues.apache.org/jira/browse/PHOENIX-7014 > Project: Phoenix > Issue Type: Sub-task >Reporter: Viraj Jasani > Assignee: Hari Krishna Dara >Priority: Major > > For CDC table type, the query optimizer should be able to query from the > uncovered global index table with the data table associated with the given CDC > table. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Reopened] (PHOENIX-7001) Change Data Capture leveraging Max Lookback and Uncovered Indexes
[ https://issues.apache.org/jira/browse/PHOENIX-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Krishna Dara reopened PHOENIX-7001: Resolved wrong item. > Change Data Capture leveraging Max Lookback and Uncovered Indexes > - > > Key: PHOENIX-7001 > URL: https://issues.apache.org/jira/browse/PHOENIX-7001 > Project: Phoenix > Issue Type: Improvement >Reporter: Kadir Ozdemir >Priority: Major > > The use cases for a Change Data Capture (CDC) feature are centered around > capturing changes to a given table (or updatable view) as these changes > happen in near real-time. A CDC application can retrieve changes in real-time > or with some delay, or even retrieve the same set of changes multiple times. > This means the CDC use case can be generalized as time range queries where > the time range is typically short such as last x minutes or hours or > expressed as a specific time range in the last n days where n is typically > less than 7. > A change is an update in a row. That is, a change is either updating one or > more columns of a table for a given row or deleting a row. It is desirable to > provide these changes in the order of their arrival. One can visualize the > delivery of these changes through a stream from a Phoenix table to the > application that is initiated by the application similar to the delivery of > any other Phoenix query results. The difference is that a regular query > result includes at most one result row for each row satisfying the query and > the deleted rows are not visible to the query result while the CDC > stream/result can include multiple result rows for each row and the result > includes deleted rows. Some use cases need to also get the pre and/or post > image of the row along with a change on the row. > The design proposed here leverages Phoenix Max Lookback and Uncovered (Global > or Local) Indexes. The max lookback feature retains recent changes to a > table, that is, the changes that have been done in the last x days typically. > This means that the max lookback feature already captures the changes to a > given table. Currently, the max lookback age is configurable at the cluster > level. We need to extend this capability to be able to configure the max > lookback age at the table level so that each table can have a different max > lookback age based on its CDC application requirements. > To deliver the changes in the order of their arrival, we need a time based > index. This index should be uncovered as the changes are already retained in > the table by the max lookback feature. The arrival time can be defined as the > mutation timestamp generated by the server, or a user-specified timestamp (or > any other long integer) column. An uncovered index would allow us > efficient and orderly access to the changes. Changes to an index table are > also preserved by the max lookback feature. > A CDC feature can be composed of the following components: > * {*}CDCUncoveredIndexRegionScanner{*}: This is a server side scanner on an > uncovered index used for CDC. This can inherit UncoveredIndexRegionScanner. > It goes through index table rows using a raw scan to identify data table rows > and retrieves these rows using a raw scan. Using the time range, it forms a > JSON blob to represent changes to the row including pre and/or post row > images. > * {*}CDC Query Compiler{*}: This is a client side component. It prepares the > scan object based on the given CDC query statement. > * {*}CDC DDL Compiler{*}: This is a client side component. 
It creates the > time based uncovered (global/local) index based on the given CDC DDL > statement and a virtual table of CDC type. CDC will be a new table type. > A CDC DDL syntax to create CDC on a (data) table can be as follows: > Create CDC <cdc name> on <table name> (PHOENIX_ROW_TIMESTAMP() | > <long integer column name>) INCLUDE (pre | post | latest | all) TTL = <time-to-live in > seconds> INDEX = <global | local> SALT_BUCKETS=<salt bucket count> > The above CDC DDL creates a virtual CDC table and an uncovered index. The CDC > table PK columns start with the timestamp or user defined column and continue > with the data table PK columns. The CDC table includes one non-PK column > which is a JSON column. The change is expressed in this JSON column in > multiple ways based on the CDC DDL or query statement. The change can be > expressed as just the mutation for the change, the latest image of the row, > the pre image of the row (the image before the change), the post image, or > any combination of these. The CDC table is not a physical table on disk. It > is ju
[jira] [Resolved] (PHOENIX-7001) Change Data Capture leveraging Max Lookback and Uncovered Indexes
[ https://issues.apache.org/jira/browse/PHOENIX-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Krishna Dara resolved PHOENIX-7001. Resolution: Fixed Change merged into the feature branch. > Change Data Capture leveraging Max Lookback and Uncovered Indexes > - > > Key: PHOENIX-7001 > URL: https://issues.apache.org/jira/browse/PHOENIX-7001 > Project: Phoenix > Issue Type: Improvement >Reporter: Kadir Ozdemir >Priority: Major > > The use cases for a Change Data Capture (CDC) feature are centered around > capturing changes to a given table (or updatable view) as these changes > happen in near real-time. A CDC application can retrieve changes in real-time > or with some delay, or even retrieve the same set of changes multiple times. > This means the CDC use case can be generalized as time range queries where > the time range is typically short such as last x minutes or hours or > expressed as a specific time range in the last n days where n is typically > less than 7. > A change is an update in a row. That is, a change is either updating one or > more columns of a table for a given row or deleting a row. It is desirable to > provide these changes in the order of their arrival. One can visualize the > delivery of these changes through a stream from a Phoenix table to the > application that is initiated by the application similar to the delivery of > any other Phoenix query results. The difference is that a regular query > result includes at most one result row for each row satisfying the query and > the deleted rows are not visible to the query result while the CDC > stream/result can include multiple result rows for each row and the result > includes deleted rows. Some use cases need to also get the pre and/or post > image of the row along with a change on the row. > The design proposed here leverages Phoenix Max Lookback and Uncovered (Global > or Local) Indexes. The max lookback feature retains recent changes to a > table, that is, the changes that have been done in the last x days typically. > This means that the max lookback feature already captures the changes to a > given table. Currently, the max lookback age is configurable at the cluster > level. We need to extend this capability to be able to configure the max > lookback age at the table level so that each table can have a different max > lookback age based on its CDC application requirements. > To deliver the changes in the order of their arrival, we need a time based > index. This index should be uncovered as the changes are already retained in > the table by the max lookback feature. The arrival time can be defined as the > mutation timestamp generated by the server, or a user-specified timestamp (or > any other long integer) column. An uncovered index would allow us > efficient and orderly access to the changes. Changes to an index table are > also preserved by the max lookback feature. > A CDC feature can be composed of the following components: > * {*}CDCUncoveredIndexRegionScanner{*}: This is a server side scanner on an > uncovered index used for CDC. This can inherit UncoveredIndexRegionScanner. > It goes through index table rows using a raw scan to identify data table rows > and retrieves these rows using a raw scan. Using the time range, it forms a > JSON blob to represent changes to the row including pre and/or post row > images. > * {*}CDC Query Compiler{*}: This is a client side component. It prepares the > scan object based on the given CDC query statement. 
> * {*}CDC DDL Compiler{*}: This is a client side component. It creates the > time based uncovered (global/local) index based on the given CDC DDL > statement and a virtual table of CDC type. CDC will be a new table type. > A CDC DDL syntax to create CDC on a (data) table can be as follows: > Create CDC <cdc name> on <table name> (PHOENIX_ROW_TIMESTAMP() | > <long integer column name>) INCLUDE (pre | post | latest | all) TTL = <time-to-live in > seconds> INDEX = <global | local> SALT_BUCKETS=<salt bucket count> > The above CDC DDL creates a virtual CDC table and an uncovered index. The CDC > table PK columns start with the timestamp or user defined column and continue > with the data table PK columns. The CDC table includes one non-PK column > which is a JSON column. The change is expressed in this JSON column in > multiple ways based on the CDC DDL or query statement. The change can be > expressed as just the mutation for the change, the latest image of the row, > the pre image of the row (the image before the change), the post image, or > any combination of these. The CDC table is not a physical table on disk.
[jira] [Created] (PHOENIX-7154) SELECT query with undefined column on an UNCOVERED INDEX results in StringIndexOutOfBoundsException
Hari Krishna Dara created PHOENIX-7154: -- Summary: SELECT query with undefined column on an UNCOVERED INDEX results in StringIndexOutOfBoundsException Key: PHOENIX-7154 URL: https://issues.apache.org/jira/browse/PHOENIX-7154 Project: Phoenix Issue Type: Bug Affects Versions: 5.1.4 Reporter: Hari Krishna Dara If you run a SELECT query directly on an uncovered index with a column name that is undefined for that index, you get a {{java.lang.StringIndexOutOfBoundsException}}. In the below sample, you can see that with a valid column the query works fine, but an undefined column causes the exception. {{0: jdbc:phoenix:localhost> create table t (k INTEGER PRIMARY KEY, v1 INTEGER);}} {{No rows affected (0.64 seconds)}} {{0: jdbc:phoenix:localhost> create uncovered index tuidx on t (PHOENIX_ROW_TIMESTAMP());}} {{No rows affected (5.671 seconds)}} {{0: jdbc:phoenix:localhost> select abc from tuidx;}} {{java.lang.StringIndexOutOfBoundsException: String index out of range: -1}} {{ at java.lang.String.substring(String.java:1967)}} {{ at org.apache.phoenix.util.IndexUtil.getDataColumnFamilyName(IndexUtil.java:200)}} {{ at org.apache.phoenix.schema.IndexUncoveredDataColumnRef.<init>(IndexUncoveredDataColumnRef.java:51)}} {{ at org.apache.phoenix.compile.TupleProjectionCompiler$ColumnRefVisitor.visit(TupleProjectionCompiler.java:269)}} {{ at org.apache.phoenix.compile.TupleProjectionCompiler$ColumnRefVisitor.visit(TupleProjectionCompiler.java:245)}} {{ at org.apache.phoenix.parse.ColumnParseNode.accept(ColumnParseNode.java:56)}} {{ at org.apache.phoenix.compile.TupleProjectionCompiler.createProjectedTable(TupleProjectionCompiler.java:127)}} {{ at org.apache.phoenix.compile.QueryCompiler.compileSingleFlatQuery(QueryCompiler.java:701)}} {{ at org.apache.phoenix.compile.QueryCompiler.compileSingleQuery(QueryCompiler.java:667)}} {{ at org.apache.phoenix.compile.QueryCompiler.compileSelect(QueryCompiler.java:249)}} {{ at org.apache.phoenix.compile.QueryCompiler.compile(QueryCompiler.java:181)}} {{ at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan(PhoenixStatement.java:724)}} {{ at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan(PhoenixStatement.java:687)}} {{ at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:368)}} {{ at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:349)}} {{ at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)}} {{ at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:349)}} {{ at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:335)}} {{ at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:2362)}} {{ at sqlline.Commands.executeSingleQuery(Commands.java:1054)}} {{ at sqlline.Commands.execute(Commands.java:1003)}} {{ at sqlline.Commands.sql(Commands.java:967)}} {{ at sqlline.SqlLine.dispatch(SqlLine.java:734)}} {{ at sqlline.SqlLine.begin(SqlLine.java:541)}} {{ at sqlline.SqlLine.start(SqlLine.java:267)}} {{ at sqlline.SqlLine.main(SqlLine.java:206)}} {{0: jdbc:phoenix:localhost> select ":K" from tuidx;}} {{+----+}} {{| :K |}} {{+----+}} {{+----+}} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PHOENIX-7014) CDC query compiler and optimizer
[ https://issues.apache.org/jira/browse/PHOENIX-7014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Krishna Dara reassigned PHOENIX-7014: -- Assignee: Hari Krishna Dara > CDC query compiler and optimizer > > > Key: PHOENIX-7014 > URL: https://issues.apache.org/jira/browse/PHOENIX-7014 > Project: Phoenix > Issue Type: Sub-task >Reporter: Viraj Jasani > Assignee: Hari Krishna Dara >Priority: Major > > For CDC table type, the query optimizer should be able to query from the > uncovered global index table with the data table associated with the given CDC > table. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PHOENIX-7055) Usage improvements for sqlline.py
Hari Krishna Dara created PHOENIX-7055: -- Summary: Usage improvements for sqlline.py Key: PHOENIX-7055 URL: https://issues.apache.org/jira/browse/PHOENIX-7055 Project: Phoenix Issue Type: Improvement Reporter: Hari Krishna Dara Assignee: Hari Krishna Dara A few small improvements to make the usage of this tool easier: * It should be possible to start sqlline without making a connection. This is useful for opening a custom connection from the prompt and also for simply browsing through the history. * Start in debug mode so that we can connect from a debug client. * Fix bugs in the existing boolean option interpretations -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PHOENIX-6977) When I run the Apache Phoenix hello world program in the Eclipse IDE it works fine, but when I run it in the NetBeans IDE I get an error
sai krishna created PHOENIX-6977: Summary: When I run the Apache Phoenix hello world program in the Eclipse IDE it works fine, but when I run it in the NetBeans IDE I get an error Key: PHOENIX-6977 URL: https://issues.apache.org/jira/browse/PHOENIX-6977 Project: Phoenix Issue Type: Bug Reporter: sai krishna When I run the Apache Phoenix hello world program in the Eclipse IDE it works fine, but when I run it in the NetBeans IDE I get the below error: Exception in thread "main" org.apache.phoenix.exception.PhoenixIOException: Can't find method newStub in org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService! at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:138) at org.apache.phoenix.query.ConnectionQueryServicesImpl.checkClientServerCompatibility(ConnectionQueryServicesImpl.java:1652) at org.apache.phoenix.query.ConnectionQueryServicesImpl.ensureTableCreated(ConnectionQueryServicesImpl.java:1462) at org.apache.phoenix.query.ConnectionQueryServicesImpl.createTable(ConnectionQueryServicesImpl.java:1913) at org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:3065) at org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:1105) at org.apache.phoenix.compile.CreateTableCompiler$CreateTableMutationPlan.execute(CreateTableCompiler.java:420) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:415) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:397) at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:396) at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:384) at org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:1906) at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:3267) at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:3230) at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76) at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:3230) at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:255) at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:144) at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221) at java.sql/java.sql.DriverManager.getConnection(DriverManager.java:677) at java.sql/java.sql.DriverManager.getConnection(DriverManager.java:251) at com.mycompany.mavenproject1.test.main(test.java:20) Caused by: java.lang.IllegalArgumentException: Can't find method newStub in org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService! 
at org.apache.hadoop.hbase.util.Methods.call(Methods.java:49) at org.apache.hadoop.hbase.protobuf.ProtobufUtil.newServiceStub(ProtobufUtil.java:1537) at org.apache.hadoop.hbase.client.HTable$12.call(HTable.java:1012) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:829) Caused by: java.lang.NoSuchMethodException: org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.newStub(com.google.protobuf.RpcChannel) at java.base/java.lang.Class.getMethod(Class.java:2108) at org.apache.hadoop.hbase.util.Methods.call(Methods.java:41) ... 6 more Command execution failed. org.apache.commons.exec.ExecuteException: Process exited with an error: 1 (Exit value: 1) at org.apache.commons.exec.DefaultExecutor.executeInternal (DefaultExecutor.java:404) at org.apache.commons.exec.DefaultExecutor.execute (DefaultExecutor.java:166) at org.codehaus.mojo.exec.ExecMojo.executeCommandLine (ExecMojo.java:1000) at org.codehaus.mojo.exec.ExecMojo.executeCommandLine (ExecMojo.java:947) at org.codehaus.mojo.exec.ExecMojo.execute (ExecMojo.java:471) at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo (DefaultBuildPluginManager.java:126) at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute2 (MojoExecutor.java:342) at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute (MojoExecutor.java:330) at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:213) at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:175) at org.apache.maven.lifecycle.internal.MojoExecutor.access$000 (MojoExec
[jira] [Created] (PHOENIX-5306) Misleading statement in document
Krishna Maheshwari created PHOENIX-5306: --- Summary: Misleading statement in document Key: PHOENIX-5306 URL: https://issues.apache.org/jira/browse/PHOENIX-5306 Project: Phoenix Issue Type: Bug Reporter: Krishna Maheshwari [https://svn.apache.org/repos/asf/phoenix/site/source/src/site/markdown/views.md] has the following misleading statement. HBase scaling is limited not by the number of tables but by the overall number of regions. "The standard SQL view syntax (with some limitations) is now supported by Phoenix to enable multiple virtual tables to all share the same underlying physical HBase table. This is especially important in HBase, as you cannot realistically expect to have more than perhaps up to a hundred physical tables and continue to get reasonable performance from HBase." This should be revised to state: "The standard SQL view syntax (with some limitations) is now supported by Phoenix to enable multiple virtual tables to all share the same underlying physical HBase table." -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (PHOENIX-4761) java.lang.NoClassDefFoundError: com/lmax/disruptor/EventFactory
[ https://issues.apache.org/jira/browse/PHOENIX-4761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497646#comment-16497646 ] Hari Krishna Dara edited comment on PHOENIX-4761 at 6/1/18 6:51 AM: I looked at the server configuration and it seems that Disruptor is not part of the HBase server dependencies in 0.98 (yes, we are still using this version): https://github.com/apache/hbase/blob/0.98/hbase-server/pom.xml Whereas the newer HBase versions have it, e.g., see: https://github.com/apache/hbase/blob/branch-1.2/hbase-server/pom.xml#L557 was (Author: haridsv): I looked at the server configuration and it seems that Disruptor is not part of the server dependencies in 0.98 (yes, we are still using this version): https://github.com/apache/hbase/blob/0.98/hbase-server/pom.xml Whereas the newer versions have it, e.g., see: https://github.com/apache/hbase/blob/branch-1.2/hbase-server/pom.xml#L557
> java.lang.NoClassDefFoundError: com/lmax/disruptor/EventFactory
> ---
>
> Key: PHOENIX-4761
> URL: https://issues.apache.org/jira/browse/PHOENIX-4761
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.14.0
> Reporter: Hari Krishna Dara
> Priority: Major
> Attachments: PHOENIX-4761.patch
>
> There was a recent additional dependency on this 3rd party library, but it is
> not made available at runtime via the assembly, so I am seeing the below
> exception:
> {noformat}
> Caused by: java.lang.NoClassDefFoundError: com/lmax/disruptor/EventFactory
> at org.apache.phoenix.query.ConnectionQueryServicesImpl.<init>(ConnectionQueryServicesImpl.java:414)
> at org.apache.phoenix.jdbc.PhoenixDriver$3.call(PhoenixDriver.java:248)
> at org.apache.phoenix.jdbc.PhoenixDriver$3.call(PhoenixDriver.java:241)
> at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4796)
> at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3589)
> at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2374)
> at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2337)
> at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2252)
> at com.google.common.cache.LocalCache.get(LocalCache.java:3990)
> at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4793)
> at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:241)
> at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
> at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221)
> at java.sql.DriverManager.getConnection(DriverManager.java:664)
> at java.sql.DriverManager.getConnection(DriverManager.java:270)
> {noformat}
> The
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-4761) java.lang.NoClassDefFoundError: com/lmax/disruptor/EventFactory
[ https://issues.apache.org/jira/browse/PHOENIX-4761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497646#comment-16497646 ] Hari Krishna Dara commented on PHOENIX-4761: I looked at the server configuration and it seems that Disruptor is not part of the server dependencies in 0.98 (yes, we are still using this version): https://github.com/apache/hbase/blob/0.98/hbase-server/pom.xml Whereas the newer versions have it, e.g., see: https://github.com/apache/hbase/blob/branch-1.2/hbase-server/pom.xml#L557
> java.lang.NoClassDefFoundError: com/lmax/disruptor/EventFactory
> ---
>
> Key: PHOENIX-4761
> URL: https://issues.apache.org/jira/browse/PHOENIX-4761
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.14.0
> Reporter: Hari Krishna Dara
> Priority: Major
> Attachments: PHOENIX-4761.patch
>
> There was a recent additional dependency on this 3rd party library, but it is
> not made available at runtime via the assembly, so I am seeing the below
> exception:
> {noformat}
> Caused by: java.lang.NoClassDefFoundError: com/lmax/disruptor/EventFactory
> at org.apache.phoenix.query.ConnectionQueryServicesImpl.<init>(ConnectionQueryServicesImpl.java:414)
> at org.apache.phoenix.jdbc.PhoenixDriver$3.call(PhoenixDriver.java:248)
> at org.apache.phoenix.jdbc.PhoenixDriver$3.call(PhoenixDriver.java:241)
> at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4796)
> at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3589)
> at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2374)
> at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2337)
> at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2252)
> at com.google.common.cache.LocalCache.get(LocalCache.java:3990)
> at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4793)
> at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:241)
> at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
> at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221)
> at java.sql.DriverManager.getConnection(DriverManager.java:664)
> at java.sql.DriverManager.getConnection(DriverManager.java:270)
> {noformat}
> The
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-4761) java.lang.NoClassDefFoundError: com/lmax/disruptor/EventFactory
[ https://issues.apache.org/jira/browse/PHOENIX-4761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497631#comment-16497631 ] Hari Krishna Dara commented on PHOENIX-4761: Looks like I can't change the status or resolution, but this can be marked invalid or the equivalent.
> java.lang.NoClassDefFoundError: com/lmax/disruptor/EventFactory
> ---
>
> Key: PHOENIX-4761
> URL: https://issues.apache.org/jira/browse/PHOENIX-4761
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.14.0
> Reporter: Hari Krishna Dara
> Priority: Major
> Attachments: PHOENIX-4761.patch
>
> There was a recent additional dependency on this 3rd party library, but it is
> not made available at runtime via the assembly, so I am seeing the below
> exception:
> {noformat}
> Caused by: java.lang.NoClassDefFoundError: com/lmax/disruptor/EventFactory
> at org.apache.phoenix.query.ConnectionQueryServicesImpl.<init>(ConnectionQueryServicesImpl.java:414)
> at org.apache.phoenix.jdbc.PhoenixDriver$3.call(PhoenixDriver.java:248)
> at org.apache.phoenix.jdbc.PhoenixDriver$3.call(PhoenixDriver.java:241)
> at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4796)
> at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3589)
> at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2374)
> at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2337)
> at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2252)
> at com.google.common.cache.LocalCache.get(LocalCache.java:3990)
> at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4793)
> at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:241)
> at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
> at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221)
> at java.sql.DriverManager.getConnection(DriverManager.java:664)
> at java.sql.DriverManager.getConnection(DriverManager.java:270)
> {noformat}
> The
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-4761) java.lang.NoClassDefFoundError: com/lmax/disruptor/EventFactory
[ https://issues.apache.org/jira/browse/PHOENIX-4761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497630#comment-16497630 ] Hari Krishna Dara commented on PHOENIX-4761: Thanks [~elserj] and [~an...@apache.org] for your replies. Our clients are using phoenix-core instead of phoenix-client, so that explains the client side issue here. Our server side programs do have the HBase classpath along with the server jar, so I am not sure why it didn't find the Disruptor classes. It does look like an issue with our configuration rather than a Phoenix issue, so I will close this jira.
> java.lang.NoClassDefFoundError: com/lmax/disruptor/EventFactory
> ---
>
> Key: PHOENIX-4761
> URL: https://issues.apache.org/jira/browse/PHOENIX-4761
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.14.0
> Reporter: Hari Krishna Dara
> Priority: Major
> Attachments: PHOENIX-4761.patch
>
> There was a recent additional dependency on this 3rd party library, but it is
> not made available at runtime via the assembly, so I am seeing the below
> exception:
> {noformat}
> Caused by: java.lang.NoClassDefFoundError: com/lmax/disruptor/EventFactory
> at org.apache.phoenix.query.ConnectionQueryServicesImpl.<init>(ConnectionQueryServicesImpl.java:414)
> at org.apache.phoenix.jdbc.PhoenixDriver$3.call(PhoenixDriver.java:248)
> at org.apache.phoenix.jdbc.PhoenixDriver$3.call(PhoenixDriver.java:241)
> at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4796)
> at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3589)
> at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2374)
> at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2337)
> at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2252)
> at com.google.common.cache.LocalCache.get(LocalCache.java:3990)
> at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4793)
> at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:241)
> at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
> at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221)
> at java.sql.DriverManager.getConnection(DriverManager.java:664)
> at java.sql.DriverManager.getConnection(DriverManager.java:270)
> {noformat}
> The
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-4761) java.lang.NoClassDefFoundError: com/lmax/disruptor/EventFactory
[ https://issues.apache.org/jira/browse/PHOENIX-4761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496552#comment-16496552 ] Hari Krishna Dara commented on PHOENIX-4761: [~elserj] You are right, it is a client side exception; I am not sure why I updated the server jar. I think I misunderstood which jar the tool was loading. It was running on the server and must have had both in the classpath, as the fix worked (it loaded the classes from the server jar instead of the client jar). I just noticed that another one of the tests that runs on the client failed for the same classes, so yes, the fix was not right. I will come back with another.
> java.lang.NoClassDefFoundError: com/lmax/disruptor/EventFactory
> ---
>
> Key: PHOENIX-4761
> URL: https://issues.apache.org/jira/browse/PHOENIX-4761
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.14.0
> Reporter: Hari Krishna Dara
> Priority: Major
> Attachments: PHOENIX-4761.patch
>
> There was a recent additional dependency on this 3rd party library, but it is
> not made available at runtime via the assembly, so I am seeing the below
> exception:
> {noformat}
> Caused by: java.lang.NoClassDefFoundError: com/lmax/disruptor/EventFactory
> at org.apache.phoenix.query.ConnectionQueryServicesImpl.<init>(ConnectionQueryServicesImpl.java:414)
> at org.apache.phoenix.jdbc.PhoenixDriver$3.call(PhoenixDriver.java:248)
> at org.apache.phoenix.jdbc.PhoenixDriver$3.call(PhoenixDriver.java:241)
> at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4796)
> at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3589)
> at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2374)
> at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2337)
> at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2252)
> at com.google.common.cache.LocalCache.get(LocalCache.java:3990)
> at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4793)
> at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:241)
> at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
> at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221)
> at java.sql.DriverManager.getConnection(DriverManager.java:664)
> at java.sql.DriverManager.getConnection(DriverManager.java:270)
> {noformat}
> The
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PHOENIX-4761) java.lang.NoClassDefFoundError: com/lmax/disruptor/EventFactory
[ https://issues.apache.org/jira/browse/PHOENIX-4761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Krishna Dara updated PHOENIX-4761: --- Attachment: PHOENIX-4761.patch
> java.lang.NoClassDefFoundError: com/lmax/disruptor/EventFactory
> ---
>
> Key: PHOENIX-4761
> URL: https://issues.apache.org/jira/browse/PHOENIX-4761
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.14.0
> Reporter: Hari Krishna Dara
> Priority: Major
> Attachments: PHOENIX-4761.patch
>
> There was a recent additional dependency on this 3rd party library, but it is
> not made available at runtime via the assembly, so I am seeing the below
> exception:
> {noformat}
> Caused by: java.lang.NoClassDefFoundError: com/lmax/disruptor/EventFactory
> at org.apache.phoenix.query.ConnectionQueryServicesImpl.<init>(ConnectionQueryServicesImpl.java:414)
> at org.apache.phoenix.jdbc.PhoenixDriver$3.call(PhoenixDriver.java:248)
> at org.apache.phoenix.jdbc.PhoenixDriver$3.call(PhoenixDriver.java:241)
> at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4796)
> at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3589)
> at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2374)
> at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2337)
> at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2252)
> at com.google.common.cache.LocalCache.get(LocalCache.java:3990)
> at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4793)
> at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:241)
> at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
> at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221)
> at java.sql.DriverManager.getConnection(DriverManager.java:664)
> at java.sql.DriverManager.getConnection(DriverManager.java:270)
> {noformat}
> The
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (PHOENIX-4761) java.lang.NoClassDefFoundError: com/lmax/disruptor/EventFactory
Hari Krishna Dara created PHOENIX-4761: -- Summary: java.lang.NoClassDefFoundError: com/lmax/disruptor/EventFactory Key: PHOENIX-4761 URL: https://issues.apache.org/jira/browse/PHOENIX-4761 Project: Phoenix Issue Type: Bug Affects Versions: 4.14.0 Reporter: Hari Krishna Dara There was a recent additional dependency on this 3rd party library, but it is not made available at runtime via the assembly, so I am seeing the below exception:
{noformat}
Caused by: java.lang.NoClassDefFoundError: com/lmax/disruptor/EventFactory
at org.apache.phoenix.query.ConnectionQueryServicesImpl.<init>(ConnectionQueryServicesImpl.java:414)
at org.apache.phoenix.jdbc.PhoenixDriver$3.call(PhoenixDriver.java:248)
at org.apache.phoenix.jdbc.PhoenixDriver$3.call(PhoenixDriver.java:241)
at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4796)
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3589)
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2374)
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2337)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2252)
at com.google.common.cache.LocalCache.get(LocalCache.java:3990)
at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4793)
at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:241)
at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:270)
{noformat}
The
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: UDF for lateral views
According to this blog ( http://phoenix-hbase.blogspot.in/2013/04/how-to-add-your-own-built-in-function.html), evaluate(...) is responsible for processing the input state of the row and filling up the ImmutableBytesWritable pointer with the transformed value. I did not find any references suggesting support for returning multiple rows for each input row. Does anyone know if the UDF framework can support that? On Tue, Jan 16, 2018 at 6:07 PM, Krishna wrote: > I would like to convert a column of ARRAY data-type such that each element > of the array is returned as a row. Hive supports it via Lateral Views ( > https://cwiki.apache.org/confluence/display/Hive/ > LanguageManual+LateralView)? > > Does UDF framework in Phoenix allow for building such functions? >
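A minimal sketch of a scalar function, loosely following the pattern from that blog post, shows why. The class and function names below are hypothetical, and registration (the ExpressionType enum for built-ins, or CREATE FUNCTION for UDFs) is omitted; the point is that evaluate() is handed one row and fills the pointer with exactly one transformed value:

// Hypothetical scalar function sketch: upper-cases its single VARCHAR argument.
// Note the contract: evaluate() produces exactly one value per input row.
import java.util.List;

import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.phoenix.expression.Expression;
import org.apache.phoenix.expression.function.ScalarFunction;
import org.apache.phoenix.schema.tuple.Tuple;
import org.apache.phoenix.schema.types.PDataType;
import org.apache.phoenix.schema.types.PVarchar;

public class MyUpperFunction extends ScalarFunction {
    public static final String NAME = "MY_UPPER";

    public MyUpperFunction() {
    }

    public MyUpperFunction(List<Expression> children) {
        super(children);
    }

    @Override
    public boolean evaluate(Tuple tuple, ImmutableBytesWritable ptr) {
        // Evaluate the child expression; on success ptr points at its bytes.
        if (!getChildren().get(0).evaluate(tuple, ptr)) {
            return false;
        }
        String value = (String) PVarchar.INSTANCE.toObject(ptr.get(), ptr.getOffset(), ptr.getLength());
        if (value == null) {
            return true; // null in, null out
        }
        // Exactly one transformed value per row is written back into ptr.
        ptr.set(PVarchar.INSTANCE.toBytes(value.toUpperCase()));
        return true;
    }

    @Override
    public PDataType getDataType() {
        return PVarchar.INSTANCE;
    }

    @Override
    public String getName() {
        return NAME;
    }
}

A lateral-view style explode would need to emit several output rows for one input row, which this interface has no way to express, so it would have to happen at the scanner/coprocessor level or as client-side post-processing.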
UDF for lateral views
I would like to convert a column of ARRAY data-type such that each element of the array is returned as a row. Hive supports it via Lateral Views ( https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView ). Does the UDF framework in Phoenix allow building such functions?
[jira] [Updated] (PHOENIX-4040) Protect against NPEs in org.apache.phoenix.compile.DeleteCompiler.deleteRows
[ https://issues.apache.org/jira/browse/PHOENIX-4040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Krishna Dara updated PHOENIX-4040: --- Description: We are occasionally seeing the below NPE coming from Phoenix code. We don't currently have a repro case for this, but since it is an NPE, Phoenix code should protect against it.
{noformat}
org.apache.phoenix.exception.PhoenixIOException: java.lang.NullPointerException
at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:113)
at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:854)
at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:798)
at org.apache.phoenix.iterate.RoundRobinResultIterator.getIterators(RoundRobinResultIterator.java:176)
at org.apache.phoenix.iterate.RoundRobinResultIterator.next(RoundRobinResultIterator.java:91)
at org.apache.phoenix.compile.DeleteCompiler$3.execute(DeleteCompiler.java:668)
at org.apache.phoenix.compile.DeleteCompiler$MultiDeleteMutationPlan.execute(DeleteCompiler.java:284)
at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:355)
at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:338)
at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:337)
at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:251)
at org.apache.phoenix.jdbc.PhoenixPreparedStatement.execute(PhoenixPreparedStatement.java:172)
at org.apache.phoenix.jdbc.PhoenixPreparedStatement.execute(PhoenixPreparedStatement.java:177)
at phoenix.connection.ProtectedPhoenixPreparedStatement.execute(ProtectedPhoenixPreparedStatement.java:74)
...
Caused by: java.util.concurrent.ExecutionException: java.lang.NullPointerException
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:206)
at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:849)
... 40 more
Caused by: java.lang.NullPointerException
at org.apache.phoenix.compile.DeleteCompiler.deleteRows(DeleteCompiler.java:105)
at org.apache.phoenix.compile.DeleteCompiler.access$000(DeleteCompiler.java:93)
at org.apache.phoenix.compile.DeleteCompiler$DeletingParallelIteratorFactory.mutate(DeleteCompiler.java:219)
at org.apache.phoenix.compile.MutatingParallelIteratorFactory.newIterator(MutatingParallelIteratorFactory.java:59)
at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:114)
at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:106)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.apache.phoenix.job.JobManager$InstrumentedJobFutureTask.run(JobManager.java:183)
... 3 more
{noformat}
was: We are occasionally seeing the below NPE coming from Phoenix code. We don't currently have a repro case for this, but since it is an NPE, Phoenix code should protect against it.
{{org.apache.phoenix.exception.PhoenixIOException: java.lang.NullPointerException at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:113) at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:854) at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:798) at org.apache.phoenix.iterate.RoundRobinResultIterator.getIterators(RoundRobinResultIterator.java:176) at org.apache.phoenix.iterate.RoundRobinResultIterator.next(RoundRobinResultIterator.java:91) at org.apache.phoenix.compile.DeleteCompiler$3.execute(DeleteCompiler.java:668) at org.apache.phoenix.compile.DeleteCompiler$MultiDeleteMutationPlan.execute(DeleteCompiler.java:284) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:355) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:338) at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:337) at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:251) at org.apache.phoenix.jdbc.PhoenixPreparedStatement.execute(PhoenixPreparedStatement.java:172) at org.apache.phoenix.jdbc.PhoenixPreparedStatement.execute(PhoenixPreparedStatement.java:177) at phoenix.connection.ProtectedPhoenixPreparedStatement.execute(ProtectedPhoenixPreparedStatement.java:74) ... Caused by: java.util.concurrent.ExecutionException: java.lang.NullPointe
[jira] [Created] (PHOENIX-4040) Protect against NPEs in org.apache.phoenix.compile.DeleteCompiler.deleteRows
Hari Krishna Dara created PHOENIX-4040: -- Summary: Protect against NPEs in org.apache.phoenix.compile.DeleteCompiler.deleteRows Key: PHOENIX-4040 URL: https://issues.apache.org/jira/browse/PHOENIX-4040 Project: Phoenix Issue Type: Bug Affects Versions: 4.10.0 Reporter: Hari Krishna Dara Priority: Minor We are occasionally seeing the below NPE coming from Phoenix code. We don't currently have a repro case for this, but since it is an NPE, Phoenix code should protect against it. {{org.apache.phoenix.exception.PhoenixIOException: java.lang.NullPointerException at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:113) at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:854) at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:798) at org.apache.phoenix.iterate.RoundRobinResultIterator.getIterators(RoundRobinResultIterator.java:176) at org.apache.phoenix.iterate.RoundRobinResultIterator.next(RoundRobinResultIterator.java:91) at org.apache.phoenix.compile.DeleteCompiler$3.execute(DeleteCompiler.java:668) at org.apache.phoenix.compile.DeleteCompiler$MultiDeleteMutationPlan.execute(DeleteCompiler.java:284) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:355) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:338) at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:337) at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:251) at org.apache.phoenix.jdbc.PhoenixPreparedStatement.execute(PhoenixPreparedStatement.java:172) at org.apache.phoenix.jdbc.PhoenixPreparedStatement.execute(PhoenixPreparedStatement.java:177) at phoenix.connection.ProtectedPhoenixPreparedStatement.execute(ProtectedPhoenixPreparedStatement.java:74) ... Caused by: java.util.concurrent.ExecutionException: java.lang.NullPointerException at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:206) at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:849) ... 40 more Caused by: java.lang.NullPointerException at org.apache.phoenix.compile.DeleteCompiler.deleteRows(DeleteCompiler.java:105) at org.apache.phoenix.compile.DeleteCompiler.access$000(DeleteCompiler.java:93) at org.apache.phoenix.compile.DeleteCompiler$DeletingParallelIteratorFactory.mutate(DeleteCompiler.java:219) at org.apache.phoenix.compile.MutatingParallelIteratorFactory.newIterator(MutatingParallelIteratorFactory.java:59) at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:114) at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:106) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at org.apache.phoenix.job.JobManager$InstrumentedJobFutureTask.run(JobManager.java:183) ... 3 more}} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
Re: Sample phoenix upserts using threads
Hi Elser, Thanks for the quick response. Below is the exception that is being logged on some region servers. Locally it works fine, but when running in a distributed environment I get the exception below.
Caused by: java.lang.AssertionError: we should never remove a different context
at org.apache.hadoop.hbase.regionserver.HRegion$RowLockContext.cleanUp(HRegion.java:5227)
at org.apache.hadoop.hbase.regionserver.HRegion$RowLockImpl.release(HRegion.java:5272)
at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2489)
at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2426)
at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:451)
... 10 more
Phoenix : 4.7.0-HBase1.1 jar HBase: 1.2.2 Cluster : 1 master 4 cores OS : Amazon EMR 5.0.0 Thanks, Krishna.
> On 01-Nov-2016, at 21:54, Josh Elser wrote:
>
> (cc: -dev +user, bcc: +dev)
>
> Hi Krishna,
>
> Might you be able to share the stacktrace that accompanied that Exception?
>
> Shiva Krishna wrote:
>> Hi All,
>> Can anyone give me a small example of Phoenix upserts using threads in Java?
>> I wrote a sample that works fine in a local environment, but when running it
>> on a cluster it fails with the error below.
>> java.lang.AssertionError: we should never remove a different context
>>
>> try (Connection conn = getConnection();
>> PreparedStatement statement = conn.prepareStatement("upsert into
>> \"test\" values(?,?,?)")) {
>> statement.setString(1, "test1");
>> statement.setString(2, "test2");
>> statement.setString(3, "test3");
>> statement.execute();
>> conn.commit();
>> } catch (SQLException ex)
>> {
>> ex.printStackTrace();
>> }
>> I tried using both threads and fork-joins but get the same exception. Sometimes
>> we are not able to predict or generalise when this exception occurs or how
>> to resolve it.
>>
>> Thanks,
>> Krishna.
Sample phoenix upserts using threads
Hi All, Can anyone give me a small example of Phoenix upserts using threads in Java? I wrote a sample that works fine in a local environment, but when running it on a cluster it fails with the error below.
java.lang.AssertionError: we should never remove a different context

try (Connection conn = getConnection();
     PreparedStatement statement = conn.prepareStatement("upsert into \"test\" values(?,?,?)")) {
    statement.setString(1, "test1");
    statement.setString(2, "test2");
    statement.setString(3, "test3");
    statement.execute();
    conn.commit();
} catch (SQLException ex) {
    ex.printStackTrace();
}

I tried using both threads and fork-joins but get the same exception. Sometimes we are not able to predict or generalise when this exception occurs or how to resolve it. Thanks, Krishna.
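For what it's worth, a minimal sketch of the usual pattern, assuming each worker opens its own connection (Phoenix JDBC connections are cheap to open, and a PhoenixConnection is not thread-safe, so sharing one connection across threads is a common source of odd errors). The URL, table, and values below are placeholders:

// Sketch: per-thread Phoenix connections for parallel upserts.
// "jdbc:phoenix:localhost" and the table name are placeholders.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelUpsert {
    private static final String URL = "jdbc:phoenix:localhost";

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int t = 0; t < 4; t++) {
            final int id = t;
            pool.submit(() -> {
                // Each thread opens and closes its own connection.
                try (Connection conn = DriverManager.getConnection(URL);
                     PreparedStatement ps = conn.prepareStatement(
                             "UPSERT INTO \"test\" VALUES (?, ?, ?)")) {
                    ps.setString(1, "key-" + id);
                    ps.setString(2, "test2");
                    ps.setString(3, "test3");
                    ps.execute();
                    conn.commit();
                } catch (SQLException ex) {
                    ex.printStackTrace();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.MINUTES);
    }
}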
Re: we should never remove a different context issue when doing a simple insert
Hbase Master logs: 2016-10-03 13:37:48,950 INFO [AM.ZK.Worker-pool2-t64] master.RegionStates: Transition {8ea153077a9750d342506849ba9a0e9c state=PENDING_OPEN, ts=1475501868940, server=ip-172-31-17-32.us-west-2.compute.internal,16020,1475498064563} to {8ea153077a9750d342506849ba9a0e9c state=OPENING, ts=1475501868950, server=ip-172-31-17-32.us-west-2.compute.internal,16020,1475498064563} 2016-10-03 13:37:49,085 INFO [AM.ZK.Worker-pool2-t67] master.RegionStates: Transition {8ea153077a9750d342506849ba9a0e9c state=OPENING, ts=1475501868950, server=ip-172-31-17-32.us-west-2.compute.internal,16020,1475498064563} to {8ea153077a9750d342506849ba9a0e9c state=OPEN, ts=1475501869085, server=ip-172-31-17-32.us-west-2.compute.internal,16020,1475498064563} 2016-10-03 13:40:20,400 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=417.43 KB, freeSize=395.89 MB, max=396.30 MB, blockCount=0, accesses=0, hits=0, hitRatio=0, cachingAccesses=0, cachingHits=0, cachingHitsRatio=0,evictions=389, evicted=0, evictedPerRun=0.0 2016-10-03 13:44:45,814 INFO [B.defaultRpcServer.handler=3,queue=0,port=16000] master.HMaster: Client=ec2-user//172.31.21.233 modify nodes 2016-10-03 13:44:46,156 INFO [ProcedureExecutor-1] util.FSTableDescriptors: Updated tableinfo=hdfs://ip-172-31-20-113.us-west-2.compute.internal:8020/user/hbase/data/default/nodes/.tabledesc/.tableinfo.000217 2016-10-03 13:44:46,468 INFO [ProcedureExecutor-1] procedure.MasterDDLOperationHelper: Bucketing regions by region server... 2016-10-03 13:44:46,471 INFO [ProcedureExecutor-1] procedure.MasterDDLOperationHelper: Reopening 5 regions on 4 region servers. Region Server 1 logs: 2016-10-03 13:42:17,705 INFO [MemStoreFlusher.0] regionserver.HRegion: Finished memstore flush of ~128.00 MB/134218312, currentsize=182.53 KB/186912 for region nodes-p_type-index,\x04\x00\x00,1475499172973.17a95c591e69416cdb8a5afca966ea69. in 1497ms, sequenceid=610564, compaction requested=false 2016-10-03 13:42:32,478 INFO [HBase-Metrics2-1] impl.MetricsSystemImpl: Stopping HBase metrics system... 2016-10-03 13:42:32,481 INFO [HBase-Metrics2-1] impl.MetricsSystemImpl: HBase metrics system stopped. 2016-10-03 13:42:32,982 INFO [HBase-Metrics2-1] impl.MetricsConfig: loaded properties from hadoop-metrics2-hbase.properties 2016-10-03 13:42:32,984 INFO [HBase-Metrics2-1] impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s). 2016-10-03 13:42:32,984 INFO [HBase-Metrics2-1] impl.MetricsSystemImpl: HBase metrics system started 2016-10-03 13:42:50,831 INFO [MemStoreFlusher.1] regionserver.HRegion: Flushing 1/1 column families, memstore=128.00 MB 2016-10-03 13:42:51,585 INFO [MemStoreFlusher.0] regionserver.HRegion: Flushing 1/1 column families, memstore=128.00 MB Region Server 2 logs: 2016-10-03 13:42:44,404 ERROR [B.defaultRpcServer.handler=10,queue=1,port=16020] coprocessor.MetaDataEndpointImpl: getTable failed java.lang.AssertionError: we should never remove a different context at org.apache.hadoop.hbase.regionserver.HRegion$RowLockContext.cleanUp(HRegion.java:5227) at org.apache.hadoop.hbase.regionserver.HRegion$RowLockImpl.release(HRegion.java:5272) at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2489) Thanks, Krishna. > On 03-Oct-2016, at 21:46, Ted Yu wrote: > > Can you pastebin region server log around the time the assertion was raised > ? > > Thanks > > On Mon, Oct 3, 2016 at 9:12 AM, Shiva Krishna > wrote: > >> Hi All, >> when I am trying to insert the data into HBase through Phoenix. 
>> I am getting the below exception >> Caused by: java.lang.AssertionError: we should never remove a different >> context >> at org.apache.hadoop.hbase.regionserver.HRegion$RowLockContext.cleanUp( >> HRegion.java:5227) >> at org.apache.hadoop.hbase.regionserver.HRegion$ >> RowLockImpl.release(HRegion.java:5272) >> at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable( >> MetaDataEndpointImpl.java:2489) >> at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable( >> MetaDataEndpointImpl.java:2426) >> at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable( >> MetaDataEndpointImpl.java:451) >> ... 10 more >> >> Phoenix : 4.7.0-HBase1.1 jar >> HBase: 1.2.2 >> >> Cluster : 1 master 4 cores >> >> OS : Amazon EMR 5.0.0 >> >> Thanks, >> Krishna.
we should never remove a different context issue when doing a simple insert
Hi All, when I try to insert data into HBase through Phoenix, I get the exception below.
Caused by: java.lang.AssertionError: we should never remove a different context
at org.apache.hadoop.hbase.regionserver.HRegion$RowLockContext.cleanUp(HRegion.java:5227)
at org.apache.hadoop.hbase.regionserver.HRegion$RowLockImpl.release(HRegion.java:5272)
at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2489)
at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2426)
at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:451)
... 10 more
Phoenix : 4.7.0-HBase1.1 jar HBase: 1.2.2 Cluster : 1 master 4 cores OS : Amazon EMR 5.0.0 Thanks, Krishna.
Guidance for Phoenix Calcite development
Hi All, I have been using Phoenix for the past 3 months (just for a project of mine). I see that you are actively working on the Phoenix - Calcite integration. Can you please point me to any documentation for understanding the Phoenix project and starting to contribute? Thanks, Krishna.
Decode rowkey
Hi, Does Phoenix have an API for splitting a rowkey (made up of multiple columns, in ImmutableBytesWritable form) into its primary key column values? I am performing a scan directly from HBase and would like to convert the rowkey into column values. We used the standard Phoenix JDBC API while writing to the table. Thanks
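I am not aware of a single one-call decoder, but the key layout is regular enough to unpack by hand. A minimal sketch under stated assumptions: a hypothetical composite key of (host VARCHAR, ts BIGINT), an unsalted table, and ascending sort order, where a non-trailing VARCHAR is terminated by a 0x00 separator byte and a BIGINT occupies a fixed 8 bytes. For metadata-driven decoding, the PK column list can be fetched via PhoenixRuntime.getTable(conn, tableName).getPKColumns().

// Sketch: hand-decoding a row key written by Phoenix, assuming the
// (hypothetical) primary key is (host VARCHAR, ts BIGINT), no salt bucket
// byte, and ascending sort order.
import org.apache.phoenix.schema.types.PLong;
import org.apache.phoenix.schema.types.PVarchar;

public class RowKeyDecode {
    public static void decode(byte[] rowKey) {
        // A non-trailing VARCHAR component ends at the 0x00 separator byte.
        int sep = 0;
        while (sep < rowKey.length && rowKey[sep] != 0) {
            sep++;
        }
        String host = (String) PVarchar.INSTANCE.toObject(rowKey, 0, sep);
        // The BIGINT component is a fixed-width 8-byte encoded long.
        long ts = (Long) PLong.INSTANCE.toObject(rowKey, sep + 1, 8);
        System.out.println(host + " / " + ts);
    }
}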
[jira] [Created] (PHOENIX-3144) Invoking org.apache.phoenix.mapreduce.CsvBulkLoadTool from phoenix-4.4.0.2.4.0.0-169-client.jar is not working properly
Radha Krishna G created PHOENIX-3144: Summary: Invoking org.apache.phoenix.mapreduce.CsvBulkLoadTool from phoenix-4.4.0.2.4.0.0-169-client.jar is not working properly Key: PHOENIX-3144 URL: https://issues.apache.org/jira/browse/PHOENIX-3144 Project: Phoenix Issue Type: Bug Affects Versions: 4.4.0 Reporter: Radha Krishna G Hi All, I am trying to load an around 40 GB file using "org.apache.phoenix.mapreduce.CsvBulkLoadTool", but it fails with the error message below.
INFO mapreduce.Job: Task Id : attempt_1469663368297_56967_m_42_0, Status : FAILED
Error: java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: (startline 1) EOF reached before encapsulated token finished
at org.apache.phoenix.mapreduce.CsvToKeyValueMapper.map(CsvToKeyValueMapper.java:176)
at org.apache.phoenix.mapreduce.CsvToKeyValueMapper.map(CsvToKeyValueMapper.java:67)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.RuntimeException: java.io.IOException: (startline 1) EOF reached before encapsulated token finished
at org.apache.commons.csv.CSVParser$1.getNextRecord(CSVParser.java:398)
at org.apache.commons.csv.CSVParser$1.hasNext(CSVParser.java:407)
at com.google.common.collect.Iterators.getNext(Iterators.java:890)
at com.google.common.collect.Iterables.getFirst(Iterables.java:781)
at org.apache.phoenix.mapreduce.CsvToKeyValueMapper$CsvLineParser.parse(CsvToKeyValueMapper.java:287)
at org.apache.phoenix.mapreduce.CsvToKeyValueMapper.map(CsvToKeyValueMapper.java:148)
... 9 more
Caused by: java.io.IOException: (startline 1) EOF reached before encapsulated token finished
at org.apache.commons.csv.Lexer.parseEncapsulatedToken(Lexer.java:282)
at org.apache.commons.csv.Lexer.nextToken(Lexer.java:152)
at org.apache.commons.csv.CSVParser.nextRecord(CSVParser.java:450)
at org.apache.commons.csv.CSVParser$1.getNextRecord(CSVParser.java:395)
... 14 more
Note: I collected around 1000 sample records from the same file and was able to load them using the same approach, but if I provide the full file path it fails. Is there any limitation on the input data (size / number of records) with this approach? I am sure there is no data issue in the input file. Below is the command I used: HADOOP_CLASSPATH=/usr/hdp/current/phoenix-client/lib/hbase-protocol.jar:/usr/hdp/current/hbase-client/conf hadoop jar phoenix-4.4.0.2.4.0.0-169-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool --table "Table_Name" --input "HDFS input file path" -d $'\034' -d $'\034' --> the field separator in the file is FS (0x1C), so we provided it explicitly. I followed the steps from the URL https://phoenix.apache.org/bulk_dataload.html The same file loads fine using the Spark approach: https://phoenix.apache.org/phoenix_spark.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
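One note on that error: "EOF reached before encapsulated token finished" comes from commons-csv and usually indicates an unbalanced quote character somewhere in the data, which would explain why a 1000-record sample loads while the full file fails. A rough diagnostic sketch (the file path is a placeholder) that re-parses the file with the same delimiter to bracket the offending record:

// Rough diagnostic sketch: re-parse the CSV with the same settings the
// bulk loader uses (delimiter 0x1C, double-quote as the quote character)
// to find where the quoting breaks. The file path is a placeholder.
import java.io.FileReader;
import java.io.Reader;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

public class FindBadCsvRecord {
    public static void main(String[] args) throws Exception {
        CSVFormat format = CSVFormat.DEFAULT.withDelimiter('\u001C');
        long last = 0;
        try (Reader in = new FileReader("/data/input.csv");
             CSVParser parser = new CSVParser(in, format)) {
            for (CSVRecord record : parser) {
                last = record.getRecordNumber();
            }
            System.out.println("Parsed " + last + " records cleanly");
        } catch (Exception e) {
            // The last good record number brackets the broken one.
            System.out.println("Failed after record " + last + ": " + e);
        }
    }
}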
Save dataframe to Phoenix
According to the Phoenix-Spark plugin docs, only SaveMode.Overwrite is supported for saving dataframes to a Phoenix table. Are there any plans to support other save modes (append, ignore) anytime soon? Having only the overwrite option makes it useful for a small number of use-cases.
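For reference, the supported path looks roughly like this from Java, assuming the Spark 1.5-era DataFrame API; the table name and quorum below are placeholders. As I understand it, SaveMode.Overwrite in phoenix-spark is executed as upserts of the frame's rows rather than a truncate-and-replace, so for keyed data it already behaves much like an append:

// Sketch: saving a DataFrame to Phoenix with the one supported mode.
// "TABLE_NAME" and the ZooKeeper quorum are placeholders.
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SaveMode;

public class SaveToPhoenix {
    public static void save(DataFrame df) {
        df.write()
          .format("org.apache.phoenix.spark")
          .mode(SaveMode.Overwrite)
          .option("table", "TABLE_NAME")
          .option("zkUrl", "localhost:2181")
          .save();
    }
}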
Re: Announcing phoenix-for-cloudera 4.6.0
What Phoenix version is in the parcels for CDH5.5.1? Is there a way to extract jars from those parcels? On Sun, Jan 17, 2016 at 5:52 AM, Alex Ott wrote: > The parcels provided by Cloudera were updated to run on CDH 5.5. > I've installed it, but didn't run very complex tasks, but basic tasks > works fine. > > Krishna at "Fri, 15 Jan 2016 18:20:47 -0800" wrote: > K> Thanks Andrew. Are binaries available for CDH5.5.x? > > K> On Tue, Nov 3, 2015 at 9:10 AM, Andrew Purtell > wrote: > > K> Today I pushed a new branch '4.6-HBase-1.0-cdh5' and the tag > 'v4.6.0-cdh5.4.5' (58fcfa6) to https://github.com/chiastic-security/ > K> phoenix-for-cloudera. This is the Phoenix 4.6.0 release, modified > to build against CDH 5.4.5 and possibly (but not tested) > K> subsequent CDH releases. > K> > K> If you want release tarballs I built from this, get them here: > > K> Binaries > K> > K> > http://apurtell.s3.amazonaws.com/phoenix/phoenix-4.6.0-cdh5.4.5-bin.tar.gz > K> > K> > http://apurtell.s3.amazonaws.com/phoenix/phoenix-4.6.0-cdh5.4.5-bin.tar.gz.asc > (signature) > K> > K> > http://apurtell.s3.amazonaws.com/phoenix/phoenix-4.6.0-cdh5.4.5-bin.tar.gz.md5 > (MD5 sum) > K> > K> > http://apurtell.s3.amazonaws.com/phoenix/phoenix-4.6.0-cdh5.4.5-bin.tar.gz.sha > (SHA-1 sum) > > K> Source > K> > K> > http://apurtell.s3.amazonaws.com/phoenix/phoenix-4.6.0-cdh5.4.5-src.tar.gz > > K> > K> > http://apurtell.s3.amazonaws.com/phoenix/phoenix-4.6.0-cdh5.4.5-src.tar.gz.asc > (signature) > K> > K> > http://apurtell.s3.amazonaws.com/phoenix/phoenix-4.6.0-cdh5.4.5-src.tar.gz.md5 > (MD5 sum) > K> > K> > http://apurtell.s3.amazonaws.com/phoenix/phoenix-4.6.0-cdh5.4.5-src.tar.gz.sha > (SHA1-sum) > > K> Signed with my code signing key D5365CCD. > K> > K> The source and these binaries incorporate changes from the > Cloudera Labs fork of Phoenix (https://github.com/cloudera-labs/ > K> phoenix), licensed under the ASL v2, Neither the source or binary > artifacts are in any way "official" or supported by the Apache > K> Phoenix project. The source and artifacts are provided by me in a > personal capacity for the convenience of would-be Phoenix users > K> that also use CDH. Please don't contact the Apache Phoenix project > for any issues regarding this source and these binaries. > K> > K> -- > K> Best regards, > K> > K>- Andy > K> > K> Problems worthy of attack prove their worth by hitting back. - > Piet Hein (via Tom White) > > > > -- > With best wishes, Alex Ott > http://alexott.blogspot.com/http://alexott.net/ > http://alexott-ru.blogspot.com/ > Skype: alex.ott >
Re: Announcing phoenix-for-cloudera 4.6.0
On the branch: 4.5-HBase-1.0-cdh5, I set cdh version to 5.5.1 in pom and building the package produces following errors. Repo: https://github.com/chiastic-security/phoenix-for-cloudera [ERROR] ~/phoenix_related/phoenix-for-cloudera/phoenix-core/src/main/java/org/apache/phoenix/trace/util/Tracing.java:[176,82] cannot find symbol [ERROR] symbol: method getParentId() [ERROR] location: variable span of type org.apache.htrace.Span [ERROR] ~/phoenix_related/phoenix-for-cloudera/phoenix-core/src/main/java/org/apache/phoenix/trace/TraceReader.java:[129,31] cannot find symbol [ERROR] symbol: variable ROOT_SPAN_ID [ERROR] location: interface org.apache.htrace.Span [ERROR] ~/phoenix_related/phoenix-for-cloudera/phoenix-core/src/main/java/org/apache/phoenix/trace/TraceReader.java:[159,38] cannot find symbol [ERROR] symbol: variable ROOT_SPAN_ID [ERROR] location: interface org.apache.htrace.Span [ERROR] ~/phoenix_related/phoenix-for-cloudera/phoenix-core/src/main/java/org/apache/phoenix/trace/TraceReader.java:[162,31] cannot find symbol [ERROR] symbol: variable ROOT_SPAN_ID [ERROR] location: interface org.apache.htrace.Span [ERROR] ~/phoenix_related/phoenix-for-cloudera/phoenix-core/src/main/java/org/apache/phoenix/trace/TraceReader.java:[337,38] cannot find symbol [ERROR] symbol: variable ROOT_SPAN_ID [ERROR] location: interface org.apache.htrace.Span [ERROR] ~/phoenix_related/phoenix-for-cloudera/phoenix-core/src/main/java/org/apache/phoenix/trace/TraceReader.java:[339,42] cannot find symbol [ERROR] symbol: variable ROOT_SPAN_ID [ERROR] location: interface org.apache.htrace.Span [ERROR] ~/phoenix_related/phoenix-for-cloudera/phoenix-core/src/main/java/org/apache/phoenix/trace/TraceReader.java:[359,58] cannot find symbol [ERROR] symbol: variable ROOT_SPAN_ID [ERROR] location: interface org.apache.htrace.Span [ERROR] ~/phoenix_related/phoenix-for-cloudera/phoenix-core/src/main/java/org/apache/phoenix/trace/TraceMetricSource.java:[99,74] cannot find symbol [ERROR] symbol: method getParentId() [ERROR] location: variable span of type org.apache.htrace.Span [ERROR] ~/phoenix_related/phoenix-for-cloudera/phoenix-core/src/main/java/org/apache/phoenix/trace/TraceMetricSource.java:[110,60] incompatible types [ERROR] required: java.util.Map [ERROR] found:java.util.Map [ERROR] ~/phoenix_related/phoenix-for-cloudera/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/UngroupedAggregateRegionObserver.java:[550,57] is not abstract and does not override abstract method nextRaw(java.util.List,org.apache.hadoop.hbase.regionserver.ScannerContext) in org.apache.hadoop.hbase.regionserver.RegionScanner On Fri, Jan 15, 2016 at 6:20 PM, Krishna wrote: > Thanks Andrew. Are binaries available for CDH5.5.x? > > On Tue, Nov 3, 2015 at 9:10 AM, Andrew Purtell > wrote: > >> Today I pushed a new branch '4.6-HBase-1.0-cdh5' and the tag >> 'v4.6.0-cdh5.4.5' (58fcfa6) to >> https://github.com/chiastic-security/phoenix-for-cloudera. This is the >> Phoenix 4.6.0 release, modified to build against CDH 5.4.5 and possibly >> (but not tested) subsequent CDH releases. 
>> >> If you want release tarballs I built from this, get them here: >> >> Binaries >> >> http://apurtell.s3.amazonaws.com/phoenix/phoenix-4.6.0-cdh5.4.5-bin.tar.gz >> >> http://apurtell.s3.amazonaws.com/phoenix/phoenix-4.6.0-cdh5.4.5-bin.tar.gz.asc >> (signature) >> >> http://apurtell.s3.amazonaws.com/phoenix/phoenix-4.6.0-cdh5.4.5-bin.tar.gz.md5 >> (MD5 sum) >> >> http://apurtell.s3.amazonaws.com/phoenix/phoenix-4.6.0-cdh5.4.5-bin.tar.gz.sha >> (SHA-1 sum) >> >> >> Source >> >> http://apurtell.s3.amazonaws.com/phoenix/phoenix-4.6.0-cdh5.4.5-src.tar.gz >> >> >> >> http://apurtell.s3.amazonaws.com/phoenix/phoenix-4.6.0-cdh5.4.5-src.tar.gz.asc >> (signature) >> >> >> http://apurtell.s3.amazonaws.com/phoenix/phoenix-4.6.0-cdh5.4.5-src.tar.gz.md5 >> (MD5 sum) >> >> >> http://apurtell.s3.amazonaws.com/phoenix/phoenix-4.6.0-cdh5.4.5-src.tar.gz.sha >> (SHA1-sum) >> >> >> Signed with my code signing key D5365CCD. >> >> The source and these binaries incorporate changes from the Cloudera Labs >> fork of Phoenix (https://github.com/cloudera-labs/phoenix), licensed >> under the ASL v2, Neither the source or binary artifacts are in any way >> "official" or supported by the Apache Phoenix project. The source and >> artifacts are provided by me in a personal capacity for the convenience of >> would-be Phoenix users that also use CDH. Please don't contact the Apache >> Phoenix project for any issues regarding this source and these binaries. >> >> -- >> Best regards, >> >>- Andy >> >> Problems worthy of attack prove their worth by hitting back. - Piet Hein >> (via Tom White) >> > >
Re: Announcing phoenix-for-cloudera 4.6.0
Thanks Andrew. Are binaries available for CDH5.5.x? On Tue, Nov 3, 2015 at 9:10 AM, Andrew Purtell wrote: > Today I pushed a new branch '4.6-HBase-1.0-cdh5' and the tag > 'v4.6.0-cdh5.4.5' (58fcfa6) to > https://github.com/chiastic-security/phoenix-for-cloudera. This is the > Phoenix 4.6.0 release, modified to build against CDH 5.4.5 and possibly > (but not tested) subsequent CDH releases. > > If you want release tarballs I built from this, get them here: > > Binaries > > http://apurtell.s3.amazonaws.com/phoenix/phoenix-4.6.0-cdh5.4.5-bin.tar.gz > > http://apurtell.s3.amazonaws.com/phoenix/phoenix-4.6.0-cdh5.4.5-bin.tar.gz.asc > (signature) > > http://apurtell.s3.amazonaws.com/phoenix/phoenix-4.6.0-cdh5.4.5-bin.tar.gz.md5 > (MD5 sum) > > http://apurtell.s3.amazonaws.com/phoenix/phoenix-4.6.0-cdh5.4.5-bin.tar.gz.sha > (SHA-1 sum) > > > Source > > http://apurtell.s3.amazonaws.com/phoenix/phoenix-4.6.0-cdh5.4.5-src.tar.gz > > > > http://apurtell.s3.amazonaws.com/phoenix/phoenix-4.6.0-cdh5.4.5-src.tar.gz.asc > (signature) > > > http://apurtell.s3.amazonaws.com/phoenix/phoenix-4.6.0-cdh5.4.5-src.tar.gz.md5 > (MD5 sum) > > > http://apurtell.s3.amazonaws.com/phoenix/phoenix-4.6.0-cdh5.4.5-src.tar.gz.sha > (SHA1-sum) > > > Signed with my code signing key D5365CCD. > > The source and these binaries incorporate changes from the Cloudera Labs > fork of Phoenix (https://github.com/cloudera-labs/phoenix), licensed > under the ASL v2, Neither the source or binary artifacts are in any way > "official" or supported by the Apache Phoenix project. The source and > artifacts are provided by me in a personal capacity for the convenience of > would-be Phoenix users that also use CDH. Please don't contact the Apache > Phoenix project for any issues regarding this source and these binaries. > > -- > Best regards, > >- Andy > > Problems worthy of attack prove their worth by hitting back. - Piet Hein > (via Tom White) >
Re: spark plugin with java
Yes, I will create new tickets for any issues that I may run into. Another question: For now I'm pursuing the option of creating a dataframe as shown in my previous email. How does spark handle parallelization in this case? Does it use phoenix metadata on splits? On Wed, Dec 2, 2015 at 11:02 AM, Josh Mahonin wrote: > Hi Krishna, > > That's great to hear. You're right, the plugin itself should be backwards > compatible to Spark 1.3.1 and should be for any version of Phoenix past > 4.4.0, though I can't guarantee that to be the case forever. As well, I > don't know how much usage there is across the board using the Java API and > DataFrames, you in fact may be the first. If you are encountering any > errors with it could you file a JIRA please with any stack traces you see? > > Since Spark is a very quickly changing project, often they update internal > functionality that we sometimes lag behind on support for, and as a result > there's no direct mapping between specific Phoenix versions and specific > Spark versions. We add new support as fast as we get patches, essentially. > > My general recommendation is to stay back a major version on Spark if > possible, but if you need to use the latest Spark releases, try use the > latest Phoenix release as well. The DataFrame support in Phoenix, for > instance, has had many patches and improvements recently that older > versions are missing. > > Thanks, > > Josh > > On Wed, Dec 2, 2015 at 1:40 PM, Krishna wrote: > >> Yes, that works for Spark 1.4.x. Website says Spark 1.3.1+ for Spark >> plugin, is that accurate? >> >> For Spark 1.3.1, I created a dataframe as follows (could not use the >> plugin): >> *Map options = new HashMap();* >> *options.put("url", PhoenixRuntime.JDBC_PROTOCOL + >> PhoenixRuntime.JDBC_PROTOCOL_SEPARATOR + zkQuorum);* >> *options.put("dbtable", "TABLE_NAME");* >> >> *SQLContext sqlContext = new SQLContext(sc);* >> *DataFrame jdbcDF = sqlContext.load("jdbc", >> options).filter("COL_NAME > SOME_VALUE");* >> >> Also, it isn't immediately obvious which version of Spark was used in >> building Phoenix artifacts available on Maven. May be, it's worth putting >> it on the website. Let me know if the mapping below is incorrect. >> >> Phoenix 4.4.x <--> Spark 1.4.0 >> > Phoenix 4.5.x <--> Spark 1.5.0 >> > Phoenix 4.6.x <--> Spark 1.5.0 >> >> >> On Tue, Dec 1, 2015 at 7:05 PM, Josh Mahonin wrote: >> >> > Hi Krishna, >> > >> > I've not tried it in Java at all, but I as of Spark 1.4+ the DataFrame >> API >> > should be unified between Scala and Java, so the following may work for >> you: >> > >> > DataFrame df = sqlContext.read() >> > .format("org.apache.phoenix.spark") >> > .option("table", "TABLE1") >> > .option("zkUrl", "") >> > .load(); >> > >> > Note that 'zkUrl' must be set to your Phoenix URL, and passing a 'conf' >> > parameter isn't supported. Please let us know back here if this works >> out >> > for you, I'd love to update the documentation and unit tests if it >> works. >> > >> > Josh >> > >> > On Tue, Dec 1, 2015 at 6:30 PM, Krishna wrote: >> > >> >> Hi, >> >> >> >> Is there a working example for using spark plugin in Java? >> Specifically, >> >> what's the java equivalent for creating a dataframe as shown here in >> scala: >> >> >> >> val df = sqlContext.phoenixTableAsDataFrame("TABLE1", Array("ID", >> "COL1"), conf = configuration) >> >> >> >> >> > >> > >
Re: spark plugin with java
Yes, that works for Spark 1.4.x. Website says Spark 1.3.1+ for the Spark plugin, is that accurate? For Spark 1.3.1, I created a dataframe as follows (could not use the plugin):

Map<String, String> options = new HashMap<String, String>();
options.put("url", PhoenixRuntime.JDBC_PROTOCOL + PhoenixRuntime.JDBC_PROTOCOL_SEPARATOR + zkQuorum);
options.put("dbtable", "TABLE_NAME");
SQLContext sqlContext = new SQLContext(sc);
DataFrame jdbcDF = sqlContext.load("jdbc", options).filter("COL_NAME > SOME_VALUE");

Also, it isn't immediately obvious which version of Spark was used in building the Phoenix artifacts available on Maven. Maybe it's worth putting it on the website. Let me know if the mapping below is incorrect. Phoenix 4.4.x <--> Spark 1.4.0 > Phoenix 4.5.x <--> Spark 1.5.0 > Phoenix 4.6.x <--> Spark 1.5.0 On Tue, Dec 1, 2015 at 7:05 PM, Josh Mahonin wrote: > Hi Krishna, > > I've not tried it in Java at all, but I as of Spark 1.4+ the DataFrame API > should be unified between Scala and Java, so the following may work for you: > > DataFrame df = sqlContext.read() > .format("org.apache.phoenix.spark") > .option("table", "TABLE1") > .option("zkUrl", "") > .load(); > > Note that 'zkUrl' must be set to your Phoenix URL, and passing a 'conf' > parameter isn't supported. Please let us know back here if this works out > for you, I'd love to update the documentation and unit tests if it works. > > Josh > > On Tue, Dec 1, 2015 at 6:30 PM, Krishna wrote: > >> Hi, >> >> Is there a working example for using spark plugin in Java? Specifically, >> what's the java equivalent for creating a dataframe as shown here in scala: >> >> val df = sqlContext.phoenixTableAsDataFrame("TABLE1", Array("ID", "COL1"), >> conf = configuration) >> >> >
spark plugin with java
Hi, Is there a working example of using the spark plugin in Java? Specifically, what's the Java equivalent of creating a dataframe as shown here in Scala: val df = sqlContext.phoenixTableAsDataFrame("TABLE1", Array("ID", "COL1"), conf = configuration)
Re: Table undefined error even though table exists
The interesting thing is, tables created using Phoenix are seen inside the HBase shell, but tables created using the HBase shell are not seen from the Phoenix command line interface. I use ./sqlline.py localhost, where HBase is running in pseudo-distributed mode with zookeeper at 2181. I am trying to see the list of tables in Phoenix with the command !tables. Is that right, or am I missing something? Why this unexpected behavior?
On Fri, Jul 31, 2015 at 1:10 PM, Vamshi Krishna wrote:
> Whatever you have mentioned is correct. Thank you. But when I do !tables,
> already existing tables that were created before Phoenix was installed should
> also be seen, right? Those tables are not seen here. And moreover, when I
> create tables in HBase, I don't use any schema explicitly, such as 'stats' or
> 'system'.
>
> On Thu, Jul 30, 2015 at 9:40 PM, Ravi Kiran wrote:
>
>> Hi Vamsi,
>>
>> Please give the full table name in select.
>> SELECT * FROM STATS.PROD_METRICS;
>>
>> Regards
>> Ravi
>>
>> On Thu, Jul 30, 2015 at 6:33 AM, Vamshi Krishna wrote:
>>
>> > Hi,
>> > I am trying to access my HBase running on my local machine with zookeeper
>> > at localhost:2181. I installed phoenix-3.3.1-bin and am trying to access an
>> > already existing HBase table, but could not. So, simply to test, I created a
>> > table using the Phoenix command line and can see it when I run the !tables
>> > command, but when I run a select command, it shows an error.
>> >
>> > This is what I am doing.
>> >
>> > 0: jdbc:phoenix:localhost> CREATE TABLE stats.prod_metrics ( host char(50)
>> > not null, created_date date not null,
>> > . . . . . . . . . . . . .> txn_count bigint CONSTRAINT pk PRIMARY KEY
>> > (host, created_date) );
>> > No rows affected (1.82 seconds)
>> >
>> > 0: jdbc:phoenix:localhost> !tables
>> > +------------+--------------+---------------+---------------+
>> > | TABLE_CAT  | TABLE_SCHEM  | TABLE_NAME    | TABLE_TYPE    |
>> > +------------+--------------+---------------+---------------+
>> > |            | SYSTEM       | CATALOG       | SYSTEM TABLE  |
>> > |            | SYSTEM       | SEQUENCE      | SYSTEM TABLE  |
>> > |            | SYSTEM       | STATS         | SYSTEM TABLE  |
>> > |            | STATS        | PROD_METRICS  | TABLE         |
>> > +------------+--------------+---------------+---------------+
>> >
>> > 0: jdbc:phoenix:localhost> select * from PROD_METRICS;
>> > Error: ERROR 1012 (42M03): Table undefined. tableName=PROD_METRICS
>> > (state=42M03,code=1012)
>> > org.apache.phoenix.schema.TableNotFoundException: ERROR 1012 (42M03): Table
>> > undefined. tableName=PROD_METRICS
>> > at org.apache.phoenix.compile.FromCompiler$BaseColumnResolver.createTableRef(FromCompiler.java:336)
>> > at org.apache.phoenix.compile.FromCompiler$SingleTableColumnResolver.<init>(FromCompiler.java:236)
>> > at org.apache.phoenix.compile.FromCompiler.getResolverForQuery(FromCompiler.java:159)
>> > at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan(PhoenixStatement.java:318)
>> > at org.apache.phoenix.jdbc.PhoenixStatement$Executab
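On the visibility question: tables created directly in the HBase shell have no entry in Phoenix's SYSTEM.CATALOG, and !tables only lists Phoenix metadata, so this is expected. To make a pre-existing HBase table visible, it has to be mapped explicitly with CREATE VIEW (or CREATE TABLE). A minimal sketch with hypothetical names; quoted identifiers must match the HBase table and column family names case-sensitively:

// Sketch: mapping a pre-existing HBase table into Phoenix so that !tables
// and SELECT can see it. "t1", "cf", and "data" are hypothetical names, and
// this assumes the row key is a plain string.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class MapExistingHBaseTable {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
             Statement stmt = conn.createStatement()) {
            // A read-only view over the existing HBase table "t1" with a
            // VARCHAR row key and one mapped column in family "cf".
            stmt.execute("CREATE VIEW \"t1\" ("
                    + " pk VARCHAR PRIMARY KEY,"
                    + " \"cf\".\"data\" VARCHAR)");
        }
    }
}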
Re: Table undefined error even though table exists
Whatever you have mentioned is correct. Thank you. But when I do !tables, already existing tables that were created before Phoenix was installed should also be seen, right? Those tables are not seen here. Moreover, when I create tables in HBase, I don't use any schema explicitly, such as 'stats' or 'system'.

On Thu, Jul 30, 2015 at 9:40 PM, Ravi Kiran wrote:
> Hi Vamsi,
>
> Please give the full table name in the select:
> SELECT * FROM STATS.PROD_METRICS;
>
> Regards
> Ravi
>
> On Thu, Jul 30, 2015 at 6:33 AM, Vamshi Krishna wrote:
>
> > Hi,
> > I am trying to access HBase running on my local machine with ZooKeeper
> > at localhost:2181. I installed phoenix-3.3.1-bin and tried to access an
> > already existing HBase table, but could not. So, simply to test, I
> > created a table using the Phoenix command line and can see it when I run
> > the !tables command, but when I run a select command, it shows an error.
> >
> > This is what I am doing:
> >
> > 0: jdbc:phoenix:localhost> CREATE TABLE stats.prod_metrics ( host char(50)
> > not null, created_date date not null,
> > . . . . . . . . . . . . .> txn_count bigint CONSTRAINT pk PRIMARY KEY
> > (host, created_date) );
> > No rows affected (1.82 seconds)
> >
> > 0: jdbc:phoenix:localhost> !tables
> > +------------+--------------+---------------+---------------+
> > | TABLE_CAT  | TABLE_SCHEM  | TABLE_NAME    | TABLE_TYPE    |
> > +------------+--------------+---------------+---------------+
> > |            | SYSTEM       | CATALOG       | SYSTEM TABLE  |
> > |            | SYSTEM       | SEQUENCE      | SYSTEM TABLE  |
> > |            | SYSTEM       | STATS         | SYSTEM TABLE  |
> > |            | STATS        | PROD_METRICS  | TABLE         |
> > +------------+--------------+---------------+---------------+
> >
> > 0: jdbc:phoenix:localhost> select * from PROD_METRICS;
> > Error: ERROR 1012 (42M03): Table undefined. tableName=PROD_METRICS
> > (state=42M03,code=1012)
> > org.apache.phoenix.schema.TableNotFoundException: ERROR 1012 (42M03): Table
> > undefined. tableName=PROD_METRICS
> > at org.apache.phoenix.compile.FromCompiler$BaseColumnResolver.createTableRef(FromCompiler.java:336)
> > at org.apache.phoenix.compile.FromCompiler$SingleTableColumnResolver.<init>(FromCompiler.java:236)
> > at org.apache.phoenix.compile.FromCompiler.getResolverForQuery(FromCompiler.java:159)
> > at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan(PhoenixStatement.java:318)
> > at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan(PhoenixStatement.java:308)
> > at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:225)
> > at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:221)
> > at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:54)
> > at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:221)
> > at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1059)
> > at sqlline.Commands.execute(Commands.java:822)
> > at sqlline.Commands.sql(Commands.java:732)
> > at sqlline.SqlLine.dispatch(SqlLine.java:808)
> > at sqlline.SqlLine.begin(SqlLine.java:681)
Table undefined error even though table exists
Hi, I am trying to access HBase running on my local machine with ZooKeeper at localhost:2181. I installed phoenix-3.3.1-bin and tried to access an already existing HBase table, but could not. So, simply to test, I created a table using the Phoenix command line and can see it when I run the !tables command, but when I run a select command, it shows an error.

This is what I am doing:

0: jdbc:phoenix:localhost> CREATE TABLE stats.prod_metrics ( host char(50) not null, created_date date not null,
. . . . . . . . . . . . .> txn_count bigint CONSTRAINT pk PRIMARY KEY (host, created_date) );
No rows affected (1.82 seconds)

0: jdbc:phoenix:localhost> !tables
+------------+--------------+---------------+---------------+
| TABLE_CAT  | TABLE_SCHEM  | TABLE_NAME    | TABLE_TYPE    |
+------------+--------------+---------------+---------------+
|            | SYSTEM       | CATALOG       | SYSTEM TABLE  |
|            | SYSTEM       | SEQUENCE      | SYSTEM TABLE  |
|            | SYSTEM       | STATS         | SYSTEM TABLE  |
|            | STATS        | PROD_METRICS  | TABLE         |
+------------+--------------+---------------+---------------+

0: jdbc:phoenix:localhost> select * from PROD_METRICS;
Error: ERROR 1012 (42M03): Table undefined. tableName=PROD_METRICS (state=42M03,code=1012)
org.apache.phoenix.schema.TableNotFoundException: ERROR 1012 (42M03): Table undefined. tableName=PROD_METRICS
        at org.apache.phoenix.compile.FromCompiler$BaseColumnResolver.createTableRef(FromCompiler.java:336)
        at org.apache.phoenix.compile.FromCompiler$SingleTableColumnResolver.<init>(FromCompiler.java:236)
        at org.apache.phoenix.compile.FromCompiler.getResolverForQuery(FromCompiler.java:159)
        at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan(PhoenixStatement.java:318)
        at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan(PhoenixStatement.java:308)
        at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:225)
        at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:221)
        at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:54)
        at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:221)
        at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1059)
        at sqlline.Commands.execute(Commands.java:822)
        at sqlline.Commands.sql(Commands.java:732)
        at sqlline.SqlLine.dispatch(SqlLine.java:808)
        at sqlline.SqlLine.begin(SqlLine.java:681)
        at sqlline.SqlLine.start(SqlLine.java:398)
        at sqlline.SqlLine.main(SqlLine.java:292)

0: jdbc:phoenix:localhost> !tables
+------------+--------------+---------------+---------------+
| TABLE_CAT  | TABLE_SCHEM  | TABLE_NAME    | TABLE_TYPE    |
+------------+--------------+---------------+---------------+
|            | SYSTEM       | CATALOG       | SYSTEM TABLE  |
|            | SYSTEM       | SEQUENCE      | SYSTEM TABLE  |
|            | SYSTEM       | STATS         | SYSTEM TABLE  |
|            | STATS        | PROD_METRICS  | TABLE         |
+------------+--------------+---------------+---------------+

Can anyone help?

--
Regards
Vamshi
Re: Composite primary keys
Thanks, Jeffrey. Is the zero-byte character separator also used between fixed-width values? From the text on the website, it looks like the separator byte is used only after variable-length data types, if I'm understanding it correctly.

> Our composite row keys are formed by simply concatenating the values
> together, with a zero byte character used as a separator after a variable
> length type.

On Tue, Mar 3, 2015 at 10:32 PM, Jeffrey Zhong wrote:
>
> Composite row keys are formed by simply concatenating the values together,
> with a zero byte character used as a separator after a variable length
> type.
>
> You can check the code in PTableImpl#newKey
>
> On 3/3/15, 10:02 PM, "Krishna" wrote:
>
> >Hi,
> >
> >How does Phoenix store composite primary keys in HBase?
> >For example, if the primary key is a composite of two columns:
> >col1 short
> >col2 integer
> >
> >Does Phoenix concatenate the 2-byte short with the 4-byte integer to
> >create a 6-byte array as the HBase rowkey?
> >
> >Please point me to the code that I can refer to for details.
> >
> >Thanks
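To make the fixed-width versus variable-length rule concrete, here is a plain-Java illustration of the layout Jeffrey describes. This is not Phoenix's actual encoder (that is PTableImpl#newKey, which, as I understand it, also flips the sign bit of signed fixed-width types so byte order matches numeric order); the column values and the PK shape (SMALLINT, VARCHAR, INTEGER) are made up for the example:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class CompositeKeySketch {
    public static void main(String[] args) throws IOException {
        short col1 = 7;     // SMALLINT: fixed 2 bytes, so no separator after it
        String col2 = "ab"; // VARCHAR: variable length, so a 0x00 separator follows
        int col3 = 42;      // INTEGER: fixed 4 bytes

        ByteArrayOutputStream key = new ByteArrayOutputStream();
        key.write(ByteBuffer.allocate(2).putShort(col1).array());
        key.write(col2.getBytes(StandardCharsets.UTF_8));
        key.write(0); // zero byte needed only because a variable-length
                      // value is followed by another PK column
        key.write(ByteBuffer.allocate(4).putInt(col3).array());

        System.out.println("row key length = " + key.size()); // 2 + 2 + 1 + 4 = 9
    }
}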
Composite primary keys
Hi, How does Phoenix store composite primary keys in HBase? For example, if the primary key is a composite of two columns: col1 short, col2 integer. Does Phoenix concatenate the 2-byte short with the 4-byte integer to create a 6-byte array as the HBase rowkey? Please point me to the code that I can refer to for details. Thanks
Re: PhoenixOutputFormat in MR job
Ravi, thanks. If the target table is salted, do I need to compute the leading byte (as I understand it, it's a hash value) in the mapper?

On Sunday, March 1, 2015, Ravi Kiran wrote:
> Hi Krishna,
>
> I assume you have already taken a look at the example here:
> http://phoenix.apache.org/phoenix_mr.html
>
> > Is there a need to compute the hash byte in the MR job?
> Can you please elaborate a bit more on what the hash byte is?
>
> > Are keys and values stored in BytesWritable before doing a
> > "context.write(...)" in the mapper?
> The key-values from a mapper to a reducer are the usual
> Writable/WritableComparable instances, and you can definitely write
> BytesWritable.
>
> Regards
> Ravi
>
> On Sun, Mar 1, 2015 at 10:04 PM, Krishna wrote:
>
>> Could someone comment on the following questions regarding the usage of
>> PhoenixOutputFormat in a standalone MR job:
>>
>> - Is there a need to compute the hash byte in the MR job?
>> - Are keys and values stored in BytesWritable before doing a
>>   "context.write(...)" in the mapper?
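For what it's worth, the leading byte on a salted table is a hash of the rest of the row key modulo the bucket count, and Phoenix computes it itself on the UPSERT path that PhoenixOutputFormat goes through, so a mapper should not need to prepend it by hand. The sketch below only illustrates the shape of the computation; the placeholder hash is not the one Phoenix uses (the real logic lives in org.apache.phoenix.schema.SaltingUtil):

import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SaltByteSketch {
    // Illustrative only: derive a bucket id in [0, saltBuckets) from key bytes.
    static byte saltByte(byte[] rowKey, int saltBuckets) {
        int hash = Arrays.hashCode(rowKey); // placeholder hash, not Phoenix's
        return (byte) Math.floorMod(hash, saltBuckets);
    }

    public static void main(String[] args) {
        byte[] key = "some-row-key".getBytes(StandardCharsets.UTF_8);
        System.out.println("bucket = " + saltByte(key, 16));
    }
}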
PhoenixOutputFormat in MR job
Could someone comment on the following questions regarding the usage of PhoenixOutputFormat in a standalone MR job:

- Is there a need to compute the hash byte in the MR job?
- Are keys and values stored in BytesWritable before doing a "context.write(...)" in the mapper?
Salt buckets optimization
Are there any recommendations for estimating and optimizing salt buckets at table-creation time? What, if any, are the cons of having a high number (200+) of salt buckets? Is it possible to update salt buckets after a table is created? Thanks
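For reference, here is how salting is declared; the table definition and JDBC URL below are made up for illustration, and as far as I know the bucket count cannot be changed after creation, since the salt byte is baked into every row key:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class SaltedTableSketch {
    public static void main(String[] args) throws SQLException {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
             Statement stmt = conn.createStatement()) {
            // SALT_BUCKETS is a table property fixed at CREATE TABLE time.
            stmt.execute("CREATE TABLE METRICS ("
                    + " HOST VARCHAR NOT NULL,"
                    + " CREATED_DATE DATE NOT NULL,"
                    + " TXN_COUNT BIGINT"
                    + " CONSTRAINT PK PRIMARY KEY (HOST, CREATED_DATE))"
                    + " SALT_BUCKETS = 16");
        }
    }
}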
Re: Reverse scan
That's great; thanks, James and Ted.

On Mon, Dec 1, 2014 at 9:13 PM, James Taylor wrote:
> Yes, as Ted points out, Phoenix will use a reverse scan to optimize an
> ORDER BY.
>
> On Mon, Dec 1, 2014 at 7:52 PM, Ted Yu wrote:
> > Please take a look at BaseQueryPlan#iterator():
> >
> >     if (OrderBy.REV_ROW_KEY_ORDER_BY.equals(orderBy)) {
> >         ScanUtil.setReversed(scan);
> >
> > Cheers
> >
> > On Mon, Dec 1, 2014 at 7:45 PM, Krishna wrote:
> >
> >> Hi,
> >>
> >> Does Phoenix support reverse scan as explained in HBASE-4811
> >> (https://issues.apache.org/jira/browse/HBASE-4811)?
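One way to see the optimization for yourself is to EXPLAIN a descending-order query and look for the reverse marker in the plan. A minimal sketch, where table T, its ID primary key, and the localhost URL are placeholders:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class ReverseScanSketch {
    public static void main(String[] args) throws SQLException {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                     "EXPLAIN SELECT * FROM T ORDER BY ID DESC")) {
            // The plan for a query Phoenix can serve with a reverse scan
            // should mention REVERSE instead of a separate sort step.
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}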
Reverse scan
Hi, Does Phoenix support reverse scan as explained in HBASE-4811 (https://issues.apache.org/jira/browse/HBASE-4811)?
[jira] [Commented] (PHOENIX-1442) Alter Index double normalize Index Table Name
[ https://issues.apache.org/jira/browse/PHOENIX-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208211#comment-14208211 ] VAMSI KRISHNA ATTLURI commented on PHOENIX-1442: How do I apply this patch in my HDP 2.2 sandbox to test it?

> Alter Index double normalize Index Table Name
> ---------------------------------------------
>
> Key: PHOENIX-1442
> URL: https://issues.apache.org/jira/browse/PHOENIX-1442
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.2
> Reporter: Jeffrey Zhong
> Assignee: Jeffrey Zhong
> Attachments: PHOENIX-1442.patch
>
> The issue reported by Vamsi Krishna Attluri on Phoenix 4.1+ with the
> following repro steps:
> {noformat}
> create index "test:table1indx1" on "test:table1"(colfam1.col3 desc);
> alter index "test:table1indx1" on "test:table1" disable;
> Error: ERROR 1012 (42M03): Table undefined. tableName=TEST:TABLE1INDX1
> (state=42M03,code=1012)
> {noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
MasterNotRunningException during bulk-load
Hi, During a bulk-load MapReduce job, this error causes some of the mappers to fail. However, the MapReduce job continues to run and finishes successfully. I did check that the Master and ZooKeeper were running at the time of this error. Any thoughts on why this might be happening?

CDH 5.2, Phoenix 4.1, HBase 0.98

Caused by: org.apache.phoenix.exception.PhoenixIOException: org.apache.hadoop.hbase.MasterNotRunningException: Can't get connection to ZooKeeper: KeeperErrorCode = ConnectionLoss for /hbase
        at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:101)
        at org.apache.phoenix.query.ConnectionQueryServicesImpl.ensureTableCreated(ConnectionQueryServicesImpl.java:817)
Caused by: org.apache.hadoop.hbase.MasterNotRunningException: Can't get connection to ZooKeeper: KeeperErrorCode = ConnectionLoss for /hbase
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(HConnectionManager.java:885)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.access$600(HConnectionManager.java:573)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(HConnectionManager.java:1577)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$StubMaker.makeStub(HConnectionManager.java:1623)
        ... 34 more
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
[jira] [Commented] (PHOENIX-1363) java.lang.ArrayIndexOutOfBoundsException with min/max query on CHAR column with '0' prefixed values
[ https://issues.apache.org/jira/browse/PHOENIX-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176752#comment-14176752 ] Hari Krishna Dara commented on PHOENIX-1363: I looked at the branches and found nothing newer than 4.1.0, and that is what I was using. I tried using master (5.0-SNAPSHOT), but it seemed to be incompatible with my cluster. What specific branch are you referring to?

> java.lang.ArrayIndexOutOfBoundsException with min/max query on CHAR column
> with '0' prefixed values
> ---------------------------------------------------------------------------
>
> Key: PHOENIX-1363
> URL: https://issues.apache.org/jira/browse/PHOENIX-1363
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.1
> Environment: HBase 0.98.4
> RHEL 6.5
> Reporter: Hari Krishna Dara
> Labels: aggregate, char
>
> While playing with the queries to reproduce PHOENIX-1362, I got the below
> exception (take the same schema and data as in PHOENIX-1362):
> {noformat}
> 0: jdbc:phoenix:isthbase01-mnds2-1-crd> select min(VAL2), min(VAL3) from TT;
> +------------+------------+
> | MIN(VAL2)  | MIN(VAL3)  |
> +------------+------------+
> java.lang.ArrayIndexOutOfBoundsException
>         at java.lang.System.arraycopy(Native Method)
>         at org.apache.phoenix.schema.KeyValueSchema.writeVarLengthField(KeyValueSchema.java:150)
>         at org.apache.phoenix.schema.KeyValueSchema.toBytes(KeyValueSchema.java:116)
>         at org.apache.phoenix.schema.KeyValueSchema.toBytes(KeyValueSchema.java:91)
>         at org.apache.phoenix.expression.aggregator.Aggregators.toBytes(Aggregators.java:109)
>         at org.apache.phoenix.iterate.GroupedAggregatingResultIterator.next(GroupedAggregatingResultIterator.java:83)
>         at org.apache.phoenix.iterate.UngroupedAggregatingResultIterator.next(UngroupedAggregatingResultIterator.java:39)
>         at org.apache.phoenix.jdbc.PhoenixResultSet.next(PhoenixResultSet.java:732)
>         at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2429)
>         at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2074)
>         at sqlline.SqlLine.print(SqlLine.java:1735)
>         at sqlline.SqlLine$Commands.execute(SqlLine.java:3683)
>         at sqlline.SqlLine$Commands.sql(SqlLine.java:3584)
>         at sqlline.SqlLine.dispatch(SqlLine.java:821)
>         at sqlline.SqlLine.begin(SqlLine.java:699)
>         at sqlline.SqlLine.mainWithInputRedirection(SqlLine.java:441)
>         at sqlline.SqlLine.main(SqlLine.java:424)
> 0: jdbc:phoenix:isthbase01-mnds2-1-crd> select min(VAL1), min(VAL2) from TT;
> +------------+------------+
> | MIN(VAL1)  | MIN(VAL2)  |
> +------------+------------+
> | 0          | null       |
> +------------+------------+
> {noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-1362) Min/max query on CHAR columns containing values with '0' as prefix always returns null
[ https://issues.apache.org/jira/browse/PHOENIX-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174801#comment-14174801 ] Hari Krishna Dara commented on PHOENIX-1362: I just found a workaround, feel free to lower the priority:

{noformat}
0: jdbc:phoenix:isthbase01-mnds2-1-crd> select min(cast(VAL2 as VARCHAR)), max(cast(VAL2 as VARCHAR)) from TT;
+------------------------+------------------------+
| MIN(TO_VARCHAR(VAL2))  | MAX(TO_VARCHAR(VAL2))  |
+------------------------+------------------------+
| 00                     | 02                     |
+------------------------+------------------------+
{noformat}

> Min/max query on CHAR columns containing values with '0' as prefix always
> returns null
> --------------------------------------------------------------------------
>
> Key: PHOENIX-1362
> URL: https://issues.apache.org/jira/browse/PHOENIX-1362
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.1
> Environment: HBase 0.98.4
> RHEL 6.5
> Reporter: Hari Krishna Dara
> Labels: aggregate, char
>
> - Create a table with CHAR type and insert a few strings that start with 0.
> - Select min()/max() on the column; you always get a null value.
> {noformat}
> 0: jdbc:phoenix:isthbase01-mnds2-1-crd> create table TT(VAL1 integer not null, VAL2 char(2), val3 varchar, VAL4 varchar constraint PK primary key (VAL1));
> 0: jdbc:phoenix:isthbase01-mnds2-1-crd> upsert into TT values (0, '00', '00', '0');
> 0: jdbc:phoenix:isthbase01-mnds2-1-crd> upsert into TT values (1, '01', '01', '1');
> 0: jdbc:phoenix:isthbase01-mnds2-1-crd> upsert into TT values (2, '02', '02', '2');
> 0: jdbc:phoenix:isthbase01-mnds2-1-crd> select * from TT;
> +-------+-------+-------+-------+
> | VAL1  | VAL2  | VAL3  | VAL4  |
> +-------+-------+-------+-------+
> | 0     | 00    | 00    | 0     |
> | 1     | 01    | 01    | 1     |
> | 2     | 02    | 02    | 2     |
> +-------+-------+-------+-------+
> 0: jdbc:phoenix:isthbase01-mnds2-1-crd> select min(VAL1), max(VAL1) from TT;
> +------------+------------+
> | MIN(VAL1)  | MAX(VAL1)  |
> +------------+------------+
> | 0          | 2          |
> +------------+------------+
> 0: jdbc:phoenix:isthbase01-mnds2-1-crd> select min(VAL2), max(VAL2) from TT;
> +------------+------------+
> | MIN(VAL2)  | MAX(VAL2)  |
> +------------+------------+
> | null       | null       |
> +------------+------------+
> 0: jdbc:phoenix:isthbase01-mnds2-1-crd> select min(VAL3), max(VAL3) from TT;
> +------------+------------+
> | MIN(VAL3)  | MAX(VAL3)  |
> +------------+------------+
> | 00         | 02         |
> +------------+------------+
> 0: jdbc:phoenix:isthbase01-mnds2-1-crd> select min(VAL4), max(VAL4) from TT;
> +------------+------------+
> | MIN(VAL4)  | MAX(VAL4)  |
> +------------+------------+
> | 0          | 2          |
> +------------+------------+
> {noformat}
> As you can see, the query on VAL2, which is of type CHAR(2), returns null,
> while the exact same values in VAL3, which is of type VARCHAR, work as expected.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (PHOENIX-1363) java.lang.ArrayIndexOutOfBoundsException with min/max query on CHAR column with '0' prefixed values
Hari Krishna Dara created PHOENIX-1363:

Summary: java.lang.ArrayIndexOutOfBoundsException with min/max query on CHAR column with '0' prefixed values
Key: PHOENIX-1363
URL: https://issues.apache.org/jira/browse/PHOENIX-1363
Project: Phoenix
Issue Type: Bug
Affects Versions: 4.1
Environment: HBase 0.98.4, RHEL 6.5
Reporter: Hari Krishna Dara

While playing with the queries to reproduce PHOENIX-1362, I got the below exception (take the same schema and data as in PHOENIX-1362):

{noformat}
0: jdbc:phoenix:isthbase01-mnds2-1-crd> select min(VAL2), min(VAL3) from TT;
+------------+------------+
| MIN(VAL2)  | MIN(VAL3)  |
+------------+------------+
java.lang.ArrayIndexOutOfBoundsException
        at java.lang.System.arraycopy(Native Method)
        at org.apache.phoenix.schema.KeyValueSchema.writeVarLengthField(KeyValueSchema.java:150)
        at org.apache.phoenix.schema.KeyValueSchema.toBytes(KeyValueSchema.java:116)
        at org.apache.phoenix.schema.KeyValueSchema.toBytes(KeyValueSchema.java:91)
        at org.apache.phoenix.expression.aggregator.Aggregators.toBytes(Aggregators.java:109)
        at org.apache.phoenix.iterate.GroupedAggregatingResultIterator.next(GroupedAggregatingResultIterator.java:83)
        at org.apache.phoenix.iterate.UngroupedAggregatingResultIterator.next(UngroupedAggregatingResultIterator.java:39)
        at org.apache.phoenix.jdbc.PhoenixResultSet.next(PhoenixResultSet.java:732)
        at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2429)
        at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2074)
        at sqlline.SqlLine.print(SqlLine.java:1735)
        at sqlline.SqlLine$Commands.execute(SqlLine.java:3683)
        at sqlline.SqlLine$Commands.sql(SqlLine.java:3584)
        at sqlline.SqlLine.dispatch(SqlLine.java:821)
        at sqlline.SqlLine.begin(SqlLine.java:699)
        at sqlline.SqlLine.mainWithInputRedirection(SqlLine.java:441)
        at sqlline.SqlLine.main(SqlLine.java:424)
0: jdbc:phoenix:isthbase01-mnds2-1-crd> select min(VAL1), min(VAL2) from TT;
+------------+------------+
| MIN(VAL1)  | MIN(VAL2)  |
+------------+------------+
| 0          | null       |
+------------+------------+
{noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PHOENIX-1362) Min/max query on CHAR columns containing values with '0' as prefix always returns null
[ https://issues.apache.org/jira/browse/PHOENIX-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Krishna Dara updated PHOENIX-1362:

Description:
- Create a table with CHAR type and insert a few strings that start with 0.
- Select min()/max() on the column; you always get a null value.

{noformat}
0: jdbc:phoenix:isthbase01-mnds2-1-crd> create table TT(VAL1 integer not null, VAL2 char(2), val3 varchar, VAL4 varchar constraint PK primary key (VAL1));
0: jdbc:phoenix:isthbase01-mnds2-1-crd> upsert into TT values (0, '00', '00', '0');
0: jdbc:phoenix:isthbase01-mnds2-1-crd> upsert into TT values (1, '01', '01', '1');
0: jdbc:phoenix:isthbase01-mnds2-1-crd> upsert into TT values (2, '02', '02', '2');
0: jdbc:phoenix:isthbase01-mnds2-1-crd> select * from TT;
+-------+-------+-------+-------+
| VAL1  | VAL2  | VAL3  | VAL4  |
+-------+-------+-------+-------+
| 0     | 00    | 00    | 0     |
| 1     | 01    | 01    | 1     |
| 2     | 02    | 02    | 2     |
+-------+-------+-------+-------+
0: jdbc:phoenix:isthbase01-mnds2-1-crd> select min(VAL1), max(VAL1) from TT;
+------------+------------+
| MIN(VAL1)  | MAX(VAL1)  |
+------------+------------+
| 0          | 2          |
+------------+------------+
0: jdbc:phoenix:isthbase01-mnds2-1-crd> select min(VAL2), max(VAL2) from TT;
+------------+------------+
| MIN(VAL2)  | MAX(VAL2)  |
+------------+------------+
| null       | null       |
+------------+------------+
0: jdbc:phoenix:isthbase01-mnds2-1-crd> select min(VAL3), max(VAL3) from TT;
+------------+------------+
| MIN(VAL3)  | MAX(VAL3)  |
+------------+------------+
| 00         | 02         |
+------------+------------+
0: jdbc:phoenix:isthbase01-mnds2-1-crd> select min(VAL4), max(VAL4) from TT;
+------------+------------+
| MIN(VAL4)  | MAX(VAL4)  |
+------------+------------+
| 0          | 2          |
+------------+------------+
{noformat}

As you can see, the query on VAL2, which is of type CHAR(2), returns null, while the exact same values in VAL3, which is of type VARCHAR, work as expected.

was:
- Create a table with CHAR type and insert a few strings that start with 0.
- Select min()/max() on the column; you always get a null value.

0: jdbc:phoenix:isthbase01-mnds2-1-crd> create table TT(VAL1 integer not null, VAL2 char(2), val3 varchar, VAL4 varchar constraint PK primary key (VAL1));
0: jdbc:phoenix:isthbase01-mnds2-1-crd> upsert into TT values (0, '00', '00', '0');
0: jdbc:phoenix:isthbase01-mnds2-1-crd> upsert into TT values (1, '01', '01', '1');
0: jdbc:phoenix:isthbase01-mnds2-1-crd> upsert into TT values (2, '02', '02', '2');
0: jdbc:phoenix:isthbase01-mnds2-1-crd> select * from TT;
+-------+-------+-------+-------+
| VAL1  | VAL2  | VAL3  | VAL4  |
+-------+-------+-------+-------+
| 0     | 00    | 00    | 0     |
| 1     | 01    | 01    | 1     |
| 2     | 02    | 02    | 2     |
+-------+-------+-------+-------+
0: jdbc:phoenix:isthbase01-mnds2-1-crd> select min(VAL1), max(VAL1) from TT;
+------------+------------+
| MIN(VAL1)  | MAX(VAL1)  |
+------------+------------+
| 0          | 2          |
+------------+------------+
0: jdbc:phoenix:isthbase01-mnds2-1-crd> select min(VAL2), max(VAL2) from TT;
+------------+------------+
| MIN(VAL2)  | MAX(VAL2)  |
+------------+------------+
| null       | null       |
+------------+------------+
0: jdbc:phoenix:isthbase01-mnds2-1-crd> select min(VAL3), max(VAL3) from TT;
+------------+------------+
| MIN(VAL3)  | MAX(VAL3)  |
+------------+------------+
| 00         | 02         |
+------------+------------+
0: jdbc:phoenix:isthbase01-mnds2-1-crd> select min(VAL4), max(VAL4) from TT;
+------------+------------+
| MIN(VAL4)  | MAX(VAL4)  |
+------------+------------+
| 0          | 2          |
+------------+------------+

As you can see, the query on VAL2, which is of type CHAR(2), returns null, while the exact same values in VAL3, which is of type VARCHAR, work as expected.

> Min/max query on CHAR columns containing values with '0' as prefix always
> returns null
> --------------------------------------------------------------------------
>
> Key: PHOENIX-1362
> URL: https://issues.apache.org/jira/browse/PHOENIX-1362
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.1
> Environment: HBase 0.98.4
> RHEL 6.5
> Reporter: Hari Krishna Dara
> Labels: aggregate, char
>
> - C
[jira] [Created] (PHOENIX-1362) Min/max query on CHAR columns containing values with '0' as prefix always returns null
Hari Krishna Dara created PHOENIX-1362:

Summary: Min/max query on CHAR columns containing values with '0' as prefix always returns null
Key: PHOENIX-1362
URL: https://issues.apache.org/jira/browse/PHOENIX-1362
Project: Phoenix
Issue Type: Bug
Affects Versions: 4.1
Environment: HBase 0.98.4, RHEL 6.5
Reporter: Hari Krishna Dara

- Create a table with CHAR type and insert a few strings that start with 0.
- Select min()/max() on the column; you always get a null value.

0: jdbc:phoenix:isthbase01-mnds2-1-crd> create table TT(VAL1 integer not null, VAL2 char(2), val3 varchar, VAL4 varchar constraint PK primary key (VAL1));
0: jdbc:phoenix:isthbase01-mnds2-1-crd> upsert into TT values (0, '00', '00', '0');
0: jdbc:phoenix:isthbase01-mnds2-1-crd> upsert into TT values (1, '01', '01', '1');
0: jdbc:phoenix:isthbase01-mnds2-1-crd> upsert into TT values (2, '02', '02', '2');
0: jdbc:phoenix:isthbase01-mnds2-1-crd> select * from TT;
+-------+-------+-------+-------+
| VAL1  | VAL2  | VAL3  | VAL4  |
+-------+-------+-------+-------+
| 0     | 00    | 00    | 0     |
| 1     | 01    | 01    | 1     |
| 2     | 02    | 02    | 2     |
+-------+-------+-------+-------+
0: jdbc:phoenix:isthbase01-mnds2-1-crd> select min(VAL1), max(VAL1) from TT;
+------------+------------+
| MIN(VAL1)  | MAX(VAL1)  |
+------------+------------+
| 0          | 2          |
+------------+------------+
0: jdbc:phoenix:isthbase01-mnds2-1-crd> select min(VAL2), max(VAL2) from TT;
+------------+------------+
| MIN(VAL2)  | MAX(VAL2)  |
+------------+------------+
| null       | null       |
+------------+------------+
0: jdbc:phoenix:isthbase01-mnds2-1-crd> select min(VAL3), max(VAL3) from TT;
+------------+------------+
| MIN(VAL3)  | MAX(VAL3)  |
+------------+------------+
| 00         | 02         |
+------------+------------+
0: jdbc:phoenix:isthbase01-mnds2-1-crd> select min(VAL4), max(VAL4) from TT;
+------------+------------+
| MIN(VAL4)  | MAX(VAL4)  |
+------------+------------+
| 0          | 2          |
+------------+------------+

As you can see, the query on VAL2, which is of type CHAR(2), returns null, while the exact same values in VAL3, which is of type VARCHAR, work as expected.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: How does Phoenix treat null values?
It's very simple to check this through the HBase shell; it's Option-II, since HFile storage is driven by HBase, not Phoenix.

On Thu, Oct 16, 2014 at 11:59 AM, Krishna wrote:
> Hi,
>
> If a table has the following data:
>
> ---------------------------------
> RowKey  | cf.a | cf.b | cf.c
> ---------------------------------
> RowKey1 | abc  |      | 123
> RowKey2 | abc  | def  |
> ---------------------------------
>
> how is data stored in HFiles for the table above (assuming I did not
> upsert null values for cf.b and cf.c)? Is it Option-I or Option-II?
>
> Option-I:
> RowKey1, cf.a, 'abc'
> RowKey1, cf.b,
> RowKey1, cf.c, '123'
> RowKey1, 0,
> RowKey2, cf.a, 'abc'
> RowKey2, cf.b, 'def'
> RowKey2, cf.c,
> RowKey2, 0,
>
> Option-II:
> RowKey1, cf.a, 'abc'
> RowKey1, cf.c, '123'
> RowKey1, 0,
> RowKey2, cf.a, 'abc'
> RowKey2, cf.b, 'def'
> RowKey2, 0,
>
> Thanks
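To confirm Option-II without the shell, a raw HBase scan shows exactly which cells each row actually has. A rough sketch against the HBase 0.98-era client API, with "MYTABLE" standing in for the Phoenix table's underlying HBase table name:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class NullStorageCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (HTable table = new HTable(conf, "MYTABLE")) { // placeholder name
            ResultScanner scanner = table.getScanner(new Scan());
            for (Result row : scanner) {
                for (Cell cell : row.rawCells()) {
                    // Columns that were never upserted simply have no cell
                    // here; only non-null columns (plus Phoenix's empty
                    // key-value column) show up per row.
                    System.out.println(Bytes.toString(row.getRow()) + " -> "
                            + Bytes.toString(CellUtil.cloneQualifier(cell)));
                }
            }
        }
    }
}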
How does Phoenix treat null values?
Hi, If a table has the following data:

---------------------------------
RowKey  | cf.a | cf.b | cf.c
---------------------------------
RowKey1 | abc  |      | 123
RowKey2 | abc  | def  |
---------------------------------

how is data stored in HFiles for the table above (assuming I did not upsert null values for cf.b and cf.c)? Is it Option-I or Option-II?

Option-I:
RowKey1, cf.a, 'abc'
RowKey1, cf.b,
RowKey1, cf.c, '123'
RowKey1, 0,
RowKey2, cf.a, 'abc'
RowKey2, cf.b, 'def'
RowKey2, cf.c,
RowKey2, 0,

Option-II:
RowKey1, cf.a, 'abc'
RowKey1, cf.c, '123'
RowKey1, 0,
RowKey2, cf.a, 'abc'
RowKey2, cf.b, 'def'
RowKey2, 0,

Thanks
[jira] [Created] (PHOENIX-1306) Option in CREATE TABLE & CREATE INDEX clauses to indicate that underlying table is already a phoenix table
Krishna created PHOENIX-1306:

Summary: Option in CREATE TABLE & CREATE INDEX clauses to indicate that underlying table is already a phoenix table
Key: PHOENIX-1306
URL: https://issues.apache.org/jira/browse/PHOENIX-1306
Project: Phoenix
Issue Type: Improvement
Reporter: Krishna

When an HBase table is converted to a Phoenix table (through CREATE TABLE or CREATE INDEX statements), Phoenix will do an upsert for each rowkey for the column qualifier "_0". If the underlying HBase table is already in Phoenix format, there is no need to redo the upsert. Provide an option in the relevant Phoenix DDLs to specify whether the underlying HBase table is already in Phoenix format. This will ensure Phoenix does not redo the upserts during the DDL. CREATE TABLE and CREATE INDEX are two examples of such DDLs.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Upper limit on SALT_BUCKETS?
50 Region Servers for 100 TB, such that each RS serves 10 regions (500 regions total). At this stage, we haven't evaluated the impact on query latency when running with fewer regions, for example, 50 RS and 250 regions.

On Wed, Sep 24, 2014 at 11:50 AM, James Taylor wrote:
> Would you be able to talk about your use case a bit and explain why you'd
> need this to be higher?
> Thanks,
> James
>
> On Wednesday, September 24, 2014, Krishna wrote:
>
>> Thanks... any plans of raising the number of bytes for the salt value?
>>
>> On Wed, Sep 24, 2014 at 10:22 AM, James Taylor wrote:
>>
>>> The salt byte is the first byte in your row key and that's the max
>>> value for a byte (i.e. it'll be 0-255).
>>>
>>> On Wed, Sep 24, 2014 at 10:12 AM, Krishna wrote:
>>> > Hi,
>>> >
>>> > According to the Phoenix documentation:
>>> >
>>> >> "Phoenix provides a way to transparently salt the row key with a salting
>>> >> byte for a particular table. You need to specify this in table creation
>>> >> time by specifying a table property “SALT_BUCKETS” with a value from 1 to
>>> >> 256"
>>> >
>>> > Is 256 the max value that SALT_BUCKETS can take? If yes, could someone
>>> > explain the reason for this upper bound?
>>> >
>>> > Krishna
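As a tiny aside on the arithmetic behind that bound: the bucket id has to fit in the single leading key byte, so 256 is the hard ceiling.

public class SaltBucketBound {
    public static void main(String[] args) {
        // One byte holds values 0..255, so a single salt byte can address
        // at most 2^8 = 256 distinct buckets.
        int maxBuckets = 1 << 8;
        System.out.println("max SALT_BUCKETS = " + maxBuckets); // 256
    }
}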
Re: Upper limit on SALT_BUCKETS?
Thanks... any plans of raising the number of bytes for the salt value?

On Wed, Sep 24, 2014 at 10:22 AM, James Taylor wrote:
> The salt byte is the first byte in your row key and that's the max
> value for a byte (i.e. it'll be 0-255).
>
> On Wed, Sep 24, 2014 at 10:12 AM, Krishna wrote:
> > Hi,
> >
> > According to the Phoenix documentation:
> >
> >> "Phoenix provides a way to transparently salt the row key with a salting
> >> byte for a particular table. You need to specify this in table creation
> >> time by specifying a table property “SALT_BUCKETS” with a value from 1 to
> >> 256"
> >
> > Is 256 the max value that SALT_BUCKETS can take? If yes, could someone
> > explain the reason for this upper bound?
> >
> > Krishna
Upper limit on SALT_BUCKETS?
Hi, According to the Phoenix documentation:

> "Phoenix provides a way to transparently salt the row key with a salting
> byte for a particular table. You need to specify this in table creation
> time by specifying a table property “SALT_BUCKETS” with a value from 1 to
> 256"

Is 256 the max value that SALT_BUCKETS can take? If yes, could someone explain the reason for this upper bound?

Krishna