[jira] [Resolved] (PHOENIX-6114) Create shaded phoenix-pherf and remove lib dir from assembly

2021-02-04 Thread Istvan Toth (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Istvan Toth resolved PHOENIX-6114.
--
Fix Version/s: 4.17.0
   5.1.0
   Resolution: Fixed

Committed to master and 4.x (but not 4.16)
Thanks for the reviews [~yanxinyi] and [~elserj].

> Create shaded phoenix-pherf and remove lib dir from assembly
> 
>
> Key: PHOENIX-6114
> URL: https://issues.apache.org/jira/browse/PHOENIX-6114
> Project: Phoenix
>  Issue Type: Improvement
>  Components: core
>Reporter: Istvan Toth
>Assignee: Istvan Toth
>Priority: Major
> Fix For: 5.1.0, 4.17.0
>
>
> The Phoenix assembly has a poorly maintained lib directory, with dependencies 
> that were for submodules that have since been moved to phoenix-queryserver. 
> The core phoenix jars are shaded, and do not need or use the libraries here.
> phoenix-tracing-webapp is not included in the assembly, and thus does not 
> need dependencies there.
> That leaves phoenix-pherf as a possible consumer of these dependencies.
> I propose refactoring phoenix-pherf similarly to 
> phoenix-queryserver, as a mostly self-contained shaded JAR that depends only 
> on phoenix-client and has the rest of its dependencies shaded in.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (PHOENIX-6356) missing row.clear() for dummy row in GlobalIndexRegionScanner

2021-02-04 Thread Istvan Toth (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Istvan Toth resolved PHOENIX-6356.
--
Fix Version/s: 4.17.0
   Resolution: Fixed

Merged to 4.x
Thank you [~tkhurana] and [~kozdemir]

> missing row.clear() for dummy row in GlobalIndexRegionScanner
> -
>
> Key: PHOENIX-6356
> URL: https://issues.apache.org/jira/browse/PHOENIX-6356
> Project: Phoenix
>  Issue Type: Bug
>  Components: core
>Affects Versions: 4.x
>Reporter: Istvan Toth
>Assignee: Tanuj Khurana
>Priority: Major
> Fix For: 4.17.0
>
>
> While porting PHOENIX-6182 to master, I ran into a problem in 
> GlobalIndexRegionScanner where the returned dummy row is passed to the 
> next(row) call.
> I have added the missing clear statement to the master patch, which fixed the 
> test case.
> This doesn't seem to affect the test on 4.x, but breaks it on master.
> At the very least, this fix should be backported to 4.x.
> Someone with a better understanding of the paging code should probably check 
> whether similar issues are present in other places.
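
The effect of the missing clear described above can be illustrated with a toy model. This is only a minimal Python sketch, not the Phoenix code: ToyScanner, DUMMY and scan are hypothetical names, and the one assumption carried over is that next(row) appends cells to the list it is handed rather than replacing its contents.

```python
# Toy model only: none of these names exist in Phoenix or HBase.
DUMMY = ("dummy-row", "dummy")  # hypothetical marker for a placeholder (dummy) row


class ToyScanner:
    """Stand-in for a scanner whose next(row) APPENDS cells to the caller's list."""

    def __init__(self, batches):
        self._batches = iter(batches)

    def next(self, row):
        batch = next(self._batches, None)
        if batch is None:
            return False  # nothing left to scan
        row.extend(batch)  # appends; does not clear what is already in row
        return True


def scan(batches, clear_dummy):
    """Collect rows, optionally clearing the buffer after a dummy row is seen."""
    scanner = ToyScanner(batches)
    row = []
    results = []
    while scanner.next(row):
        if row == [DUMMY]:
            if clear_dummy:
                row.clear()  # the "missing clear": drop the dummy before reusing row
            continue
        results.append(list(row))
        row.clear()
    return results


batches = [[DUMMY], [("r1", "a")], [("r2", "b")]]
buggy = scan(batches, clear_dummy=False)  # dummy cell leaks into the first real row
fixed = scan(batches, clear_dummy=True)   # only real cells are returned
```

In the toy model, the run without the clear returns a first row that still carries the dummy cell, while the run with the clear returns only the real cells.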





[jira] [Assigned] (PHOENIX-6356) missing row.clear() for dummy row in GlobalIndexRegionScanner

2021-02-04 Thread Tanuj Khurana (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tanuj Khurana reassigned PHOENIX-6356:
--

Assignee: Tanuj Khurana

> missing row.clear() for dummy row in GlobalIndexRegionScanner
> -
>
> Key: PHOENIX-6356
> URL: https://issues.apache.org/jira/browse/PHOENIX-6356
> Project: Phoenix
>  Issue Type: Bug
>  Components: core
>Affects Versions: 4.x
>Reporter: Istvan Toth
>Assignee: Tanuj Khurana
>Priority: Major
>
> While porting PHOENIX-6182 to master, I ran into a problem in 
> GlobalIndexRegionScanner where the returned dummy row is passed to the 
> next(row) call.
> I have added the missing clear statement to the master patch, which fixed the 
> test case.
> This doesn't seem to affect the test on 4.x, but breaks it on master.
> At the very least, this fix should be backported to 4.x.
> Someone with a better understanding of the paging code should probably check 
> whether similar issues are present in other places.





Re: [Discuss] Phoenix Tech Talks

2021-02-04 Thread Josh Elser
Love it! I'll do my best to join in and listen (and participate later 
on, too ;))


I joined one from Calcite a week or two ago. They did a signup via 
Meetup.com and hosted it through Zoom. It felt very professional.


On 2/4/21 12:10 PM, Kadir Ozdemir wrote:

We are very excited to propose an idea that brings the Phoenix community
together to have technical discussions on a recurring basis. The goal is to
have a forum where we share technical knowledge we have acquired by working
on various aspects of Phoenix and to continue to bring innovation and
improvements as a community into Phoenix. We’d love to get feedback on this
idea and determine the logistics for these meetings.

Here is what we were thinking:

- Come together as a community by hosting *Phoenix tech talks* once a
month
- The topics for these meetings can be any technical subject related to
Phoenix, including the architecture, internals, features and interfaces of
Phoenix, its operational aspects in first-party data centers and the cloud,
the technologies that it leverages (e.g., HBase and ZooKeeper), and
technologies it could leverage, adapt or follow

*Logistics*:

- *When*: First Thursday of each month at 9AM PST
- *Duration*: 90 minutes (to allow the audience to participate and ask
questions)
- We will conduct these meetings over a video conference and make the
recordings available (we are sorting out the specifics)
- The meeting agenda and past recordings will be available on the Apache
Phoenix site

We need a coordinator for these meetings to set the agenda and manage the
logistics. I will volunteer to organize these meetings and curate the
topics for the tech talks, at least initially. To get the ball rolling, I
will present the strongly consistent global indexes in the first meeting.
What do you think about this proposal?

Thanks,
Kadir



[jira] [Updated] (PHOENIX-5586) Add documentation for Splittable SYSTEM.CATALOG

2021-02-04 Thread Chinmay Kulkarni (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinmay Kulkarni updated PHOENIX-5586:
--
Fix Version/s: (was: 4.16.0)
   4.16.1

> Add documentation for Splittable SYSTEM.CATALOG
> ---
>
> Key: PHOENIX-5586
> URL: https://issues.apache.org/jira/browse/PHOENIX-5586
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 4.15.0, 5.1.0
>Reporter: Chinmay Kulkarni
>Priority: Blocker
> Fix For: 5.1.1, 4.16.1
>
>
> There are many changes after PHOENIX-3534 especially for backwards 
> compatibility. There are additional configurations such as 
> "phoenix.allow.system.catalog.rollback" which allows rollback of splittable 
> SYSTEM.CATALOG, etc. We should document these changes.





[Discuss] Phoenix Tech Talks

2021-02-04 Thread Kadir Ozdemir
We are very excited to propose an idea that brings the Phoenix community
together to have technical discussions on a recurring basis. The goal is to
have a forum where we share technical knowledge we have acquired by working
on various aspects of Phoenix and to continue to bring innovation and
improvements as a community into Phoenix. We’d love to get feedback on this
idea and determine the logistics for these meetings.

Here is what we were thinking:

   - Come together as a community by hosting *Phoenix tech talks* once a
   month
   - The topics for these meetings can be any technical subject related to
   Phoenix, including the architecture, internals, features and interfaces of
   Phoenix, its operational aspects in first-party data centers and the cloud,
   the technologies that it leverages (e.g., HBase and ZooKeeper), and
   technologies it could leverage, adapt or follow

*Logistics*:

   - *When*: First Thursday of each month at 9AM PST
   - *Duration*: 90 minutes (to allow the audience to participate and ask
   questions)
   - We will conduct these meetings over a video conference and make the
   recordings available (we are sorting out the specifics)
   - The meeting agenda and past recordings will be available on the Apache
   Phoenix site

We need a coordinator for these meetings to set the agenda and manage the
logistics. I will volunteer to organize these meetings and curate the
topics for the tech talks, at least initially. To get the ball rolling, I
will present the strongly consistent global indexes in the first meeting.
What do you think about this proposal?

Thanks,
Kadir


Re: [VOTE] Release of Apache Phoenix 5.1.0 RC2

2021-02-04 Thread Istvan Toth
The pherf cleanup PRs are ready at
https://issues.apache.org/jira/browse/PHOENIX-6114.

I will start the next RC as soon as they are approved (or vetoed)

regards
Istvan

On Thu, Feb 4, 2021 at 8:18 AM Istvan Toth wrote:

> I've pushed the addendum to fix the phoenix-pherf test failure to master
> and 4.x (but not to 4.16, which also needs it).
>
> However, on closer inspection, we still have a lot of problems with pherf:
>
> pherf-cluster.py seems to have been missed in the python3 support pass,
> and doesn't even start.
> I highly suspect that even if it were able to start, we didn't keep lib
> updated with all the necessary dependencies for it to work properly.
> Additionally, I don't even see the advantage over pherf-standalone, and it
> is the only reason we have a huge /lib directory of out-of-date
> jars in the assembly, so I think we should just remove it.
>
> We also generate a shaded phoenix-pherf-minimal JAR, which needs
> phoenix-compat-hbase to work, but we do not
> publish multiple versions of it, and it is also dubious whether it even works at
> all with the recent changes.
>
> I will try to solve the above problems by creating a shaded phoenix-pherf
> jar that works like phoenix-queryserver does, and removing
> pherf-cluster.py and 95% of the contents of the /lib dir in the
> assembly.
>
> Now that I have enumerated all the pherf problems we have, I think it's worth
> delaying the next RC by a day or two to fix these problems.
>
> regards
> Istvan
>
> On Thu, Feb 4, 2021 at 4:58 AM Istvan Toth wrote:
>
>> -1 because of the Pherf test classpath regression
>>
>> On Thu, Feb 4, 2021 at 4:57 AM Istvan Toth wrote:
>>
>>> The above exception usually happens when you use the official upstream
>>> HBase artifacts for building.
>>> See BUILDING.md on how to rebuild HBase for Hadoop 3.
>>>
>>> However, I also broke phoenix-pherf's test classpath, which probably
>>> affects 4.x, too.
>>>
>>> Expect an addendum to PHOENIX-6360 and an RC3 coming soon.
>>>
>>> Sorry for all the RC noise.
>>>
>>> Istvan
>>>
>>> On Thu, Feb 4, 2021 at 4:10 AM Xinyi Yan wrote:
>>>
>>>> Huh. With my JDK8 environment, mvn clean install -DskipTests doesn't have
>>>> an issue. However, mvn clean verify -Dhbase.profile=2.4 seems to have a
>>>> problem, see the following:
>>>>
>>>> ---
>>>> Test set: org.apache.phoenix.hbase.index.write.recovery.TestPerRegionIndexWriteCache
>>>> ---
>>>> Tests run: 3, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 14.021 s <<< FAILURE! - in org.apache.phoenix.hbase.index.write.recovery.TestPerRegionIndexWriteCache
>>>> testMultipleRegions(org.apache.phoenix.hbase.index.write.recovery.TestPerRegionIndexWriteCache)  Time elapsed: 0.342 s  <<< ERROR!
>>>> java.lang.*IncompatibleClassChangeError*: Found interface org.apache.hadoop.hdfs.protocol.HdfsFileStatus, but class was expected
>>>> at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper.createOutput(FanOutOneBlockAsyncDFSOutputHelper.java:536)
>>>> at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper.access$400(FanOutOneBlockAsyncDFSOutputHelper.java:112)
>>>> at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper$8.doCall(FanOutOneBlockAsyncDFSOutputHelper.java:616)
>>>> at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper$8.doCall(FanOutOneBlockAsyncDFSOutputHelper.java:611)
>>>> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>>>> at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper.createOutput(FanOutOneBlockAsyncDFSOutputHelper.java:624)
>>>> at org.apache.hadoop.hbase.io.asyncfs.AsyncFSOutputHelper.createOutput(AsyncFSOutputHelper.java:53)
>>>> at org.apache.hadoop.hbase.regionserver.wal.AsyncProtobufLogWriter.initOutput(AsyncProtobufLogWriter.java:180)
>>>> at org.apache.hadoop.hbase.regionserver.wal.AbstractProtobufLogWriter.init(AbstractProtobufLogWriter.java:166)
>>>> at org.apache.hadoop.hbase.wal.AsyncFSWALProvider.createAsyncWriter(AsyncFSWALProvider.java:113)
>>>> at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.createWriterInstance(AsyncFSWAL.java:662)
>>>> at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.createWriterInstance(AsyncFSWAL.java:130)
>>>> at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.rollWriter(AbstractFSWAL.java:848)
>>>> at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.rollWriter(AbstractFSWAL.java:551)
>>>> at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.init(AbstractFSWAL.java:492)
>>>> at org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:161)

[jira] [Created] (PHOENIX-6364) CSVBulkload will cause duplicate data to be queried when a global index is created for each field in the table.

2021-02-04 Thread XiaShuangQi (Jira)
XiaShuangQi created PHOENIX-6364:


 Summary: CSVBulkload will cause duplicate data to be queried when 
a global index is created for each field in the table.
 Key: PHOENIX-6364
 URL: https://issues.apache.org/jira/browse/PHOENIX-6364
 Project: Phoenix
  Issue Type: Bug
  Components: core
Affects Versions: 5.0.0, 4.13.1
Reporter: XiaShuangQi


HBase version :1.3.1

phoenix version: apache-phoenix-4.13.0-HBase-1.3

 (download from [http://phoenix.apache.org/download.html])

phoenix client version: apache-phoenix-4.13.0-HBase-1.3

  (download from [http://phoenix.apache.org/download.html])

step 1:create table

0: jdbc:phoenix> create table testtable3(
. . . . . . . .> DATE varchar not null,
. . . . . . . .> NUM integer not null,
. . . . . . . .> SEQ_NUM integer not null,
. . . . . . . .> ACCOUNT1 varchar not null, 
. . . . . . . .> ACCOUNTDES varchar,
. . . . . . . .> FLAG varchar,
. . . . . . . .> SALL DOUBLE,
. . . . . . . .> CONSTRAINT PK PRIMARY KEY (DATE,NUM,SEQ_NUM,ACCOUNT1)
. . . . . . . .> );

step 2: upsert data with primary key 

 UPSERT INTO testtable3 (DATE,NUM,SEQ_NUM,ACCOUNT1,ACCOUNTDES,FLAG,SALL) values 
('20201001',30201001,13,'367392332','sffa1','','');
UPSERT INTO testtable3 (DATE,NUM,SEQ_NUM,ACCOUNT1,ACCOUNTDES,FLAG,SALL) values 
('20201002',30201002,14,'367392333','sffa2','','');
UPSERT INTO testtable3 (DATE,NUM,SEQ_NUM,ACCOUNT1,ACCOUNTDES,FLAG,SALL) values 
('20201003',30201003,15,'367392334','sffa3','','');
UPSERT INTO testtable3 (DATE,NUM,SEQ_NUM,ACCOUNT1,ACCOUNTDES,FLAG,SALL) values 
('20201004',30201004,16,'367392335','sffa4','','');
UPSERT INTO testtable3 (DATE,NUM,SEQ_NUM,ACCOUNT1,ACCOUNTDES,FLAG,SALL) values 
('20201005',30201005,17,'367392336','sffa5','','');
UPSERT INTO testtable3 (DATE,NUM,SEQ_NUM,ACCOUNT1,ACCOUNTDES,FLAG,SALL) values 
('20201006',30201006,18,'367392337','sffa6','','');
UPSERT INTO testtable3 (DATE,NUM,SEQ_NUM,ACCOUNT1,ACCOUNTDES,FLAG,SALL) values 
('20201007',30201007,19,'367392338','sffa7','','');
UPSERT INTO testtable3 (DATE,NUM,SEQ_NUM,ACCOUNT1,ACCOUNTDES,FLAG,SALL) values 
('20201008',30201008,20,'367392339','sffa8','','');
UPSERT INTO testtable3 (DATE,NUM,SEQ_NUM,ACCOUNT1,ACCOUNTDES,FLAG,SALL) values 
('20201009',30201009,21,'367392340','sffa9','','');
UPSERT INTO testtable3 (DATE,NUM,SEQ_NUM,ACCOUNT1,ACCOUNTDES,FLAG,SALL) values 
('20201010',30201010,22,'367392341','sffa10','','');

step 3: create a global index that covers more than the primary key

 CREATE INDEX testtable3_ID ON testtable3 
(ACCOUNT1,DATE,NUM,ACCOUNTDES,SEQ_NUM);

step 4: CSVBulkload data with the same primary keys as before but different values in the other fields
|20201001|30201001|13|3.67E+08|sffa2|1231243|23|
|20201002|30201002|14|3.67E+08|sffa3|1231244|24|
|20201003|30201003|15|3.67E+08|sffa4|1231245|25|
|20201004|30201004|16|3.67E+08|sffa5|1231246|26|
|20201005|30201005|17|3.67E+08|sffa6|1231247|27|
|20201006|30201006|18|3.67E+08|sffa7|1231248|28|
|20201007|30201007|19|3.67E+08|sffa8|1231249|29|
|20201008|30201008|20|3.67E+08|sffa9|1231250|30|
|20201009|30201009|21|3.67E+08|sffa10|1231251|31|
|20201010|30201010|22|3.67E+08|sffa11|1231252|32|

step 5: select data

select DATE,NUM,SEQ_NUM,ACCOUNT1 from testtable3;
+-----------+-----------+----------+------------+
| DATE      | NUM       | SEQ_NUM  | ACCOUNT1   |
+-----------+-----------+----------+------------+
| 20201001  | 20201001  | 13       | 367392332  |
| 20201001  | 30201001  | 13       | 367392332  |
| 20201001  | 30201001  | 13       | 367392332  |
| 20201002  | 30201002  | 14       | 367392333  |
| 20201002  | 30201002  | 14       | 367392333  |
| 20201003  | 30201003  | 15       | 367392334  |
| 20201003  | 30201003  | 15       | 367392334  |
| 20201004  | 30201004  | 16       | 367392335  |
| 20201004  | 30201004  | 16       | 367392335  |
| 20201005  | 30201005  | 17       | 367392336  |
| 20201005  | 30201005  | 17       | 367392336  |
| 20201006  | 30201006  | 18       | 367392337  |
| 20201006  | 30201006  | 18       | 367392337  |
| 20201007  | 30201007  | 19       | 367392338  |
| 20201007  | 30201007  | 19       | 367392338  |
| 20201008  | 30201008  | 20       | 367392339  |
| 20201008  | 30201008  | 20       | 367392339  |
| 20201009  | 30201009  | 21       | 367392340  |
| 20201009  | 30201009  | 21       | 367392340  |
| 20201010  | 30201010  | 22       | 367392341  |
| 20201010  | 30201010  | 22       | 367392341  |
+-----------+-----------+----------+------------+
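
One way to confirm the duplication in a result like the one above is to count how often each primary-key tuple occurs. The following is a minimal Python sketch under stated assumptions: the rows are the first five copied by hand from the output above (a real check would fetch them from the table), and duplicated_keys is a hypothetical helper, not part of Phoenix.

```python
from collections import Counter

# First five rows copied from the query output above: (DATE, NUM, SEQ_NUM, ACCOUNT1).
rows = [
    ("20201001", 20201001, 13, "367392332"),
    ("20201001", 30201001, 13, "367392332"),
    ("20201001", 30201001, 13, "367392332"),
    ("20201002", 30201002, 14, "367392333"),
    ("20201002", 30201002, 14, "367392333"),
]


def duplicated_keys(rows):
    """Return each primary-key tuple that occurs more than once, with its count."""
    counts = Counter(rows)
    return {key: n for key, n in counts.items() if n > 1}


dups = duplicated_keys(rows)
for key, n in dups.items():
    print(key, "returned", n, "times")
```

Since the four selected columns form the primary key, any tuple reported here is a row that the query returned more than once.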

and we can see the index data:

0: jdbc:phoenix> select * from testtable3_ID;
2021-02-04 19:50:48,685 | INFO | hconnection-0x3943a2be-shared--pool1-t352 | 
RPC Server Kerberos principal name for service=ClientService is 
hbase/hadoop.hadoop@hadoop.com | 
org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.processPreambleResponse(RpcClientImpl.java:824)
2021-02-04 19:50:48,699 | INFO | hconnection-0x3943a2be-shared--pool1-t353 | 
RPC Server Kerberos principal name for service=ClientService is 
hbase/hadoop.hadoop@hadoop.com | 
org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.processPreambleResponse(RpcClientImpl.java:824)

[jira] [Updated] (PHOENIX-6362) Remove pherf-cluster.py

2021-02-04 Thread Istvan Toth (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Istvan Toth updated PHOENIX-6362:
-
Parent: PHOENIX-6114
Issue Type: Sub-task  (was: Improvement)

> Remove pherf-cluster.py
> ---
>
> Key: PHOENIX-6362
> URL: https://issues.apache.org/jira/browse/PHOENIX-6362
> Project: Phoenix
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 5.1.0
>Reporter: Istvan Toth
>Assignee: Istvan Toth
>Priority: Major
>
> pherf-cluster.py is supposed to use the JARs in /lib instead of 
> phoenix-client.
> At least for master we know that the libraries in /lib haven't been 
> maintained in a long time, and do not contain everything needed to run the 
> client.
> I also do not see the advantage over pherf-standalone, which uses the shaded 
> client, which is already available anyway.





[jira] [Assigned] (PHOENIX-6362) Remove pherf-cluster.py

2021-02-04 Thread Istvan Toth (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Istvan Toth reassigned PHOENIX-6362:


Assignee: Istvan Toth

> Remove pherf-cluster.py
> ---
>
> Key: PHOENIX-6362
> URL: https://issues.apache.org/jira/browse/PHOENIX-6362
> Project: Phoenix
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 5.1.0
>Reporter: Istvan Toth
>Assignee: Istvan Toth
>Priority: Major
>
> pherf-cluster.py is supposed to use the JARs in /lib instead of 
> phoenix-client.
> At least for master we know that the libraries in /lib haven't been 
> maintained in a long time, and do not contain everything needed to run the 
> client.
> I also do not see the advantage over pherf-standalone, which uses the shaded 
> client, which is already available anyway.


