Re: Tech Talk on Thursday : Phoenix High Availability

2021-07-05 Thread Josh Elser

Sorry I missed this one! I'll have to catch the recording :)

On 6/29/21 1:30 PM, Kadir Ozdemir wrote:

Hi All,

I am hoping that you will be able to join the tech talk meeting 
on Thursday. Mingliang Liu, Daniel Wong, and Abhishek Singh Chouhan will 
introduce a new feature, Phoenix High Availability (HA) which allows 
Phoenix users to interact with multiple Phoenix/HBase clusters in order 
to achieve additional availability. Please see 
https://phoenix.apache.org/tech_talks.html for details.


Thanks,
Kadir


Re: Phoenix Error - ERROR 504 (4203)

2021-06-22 Thread Josh Elser
Assuming you mean 5.1.2 (as there is no such release 5.2.1) and without 
any other data that would be necessary to properly diagnose what happened...


I would assume that the upgrade code inside of Phoenix to automatically 
upgrade the Phoenix system tables is not effective across such a large 
release gap. (4.8 was released approximately 5 years ago)


Unless you are prepared to look at Phoenix code and perform surgery on 
your system.catalog table, I'd suggest that dropping the system tables 
via HBase, re-creating the system tables by connecting, and then 
recreating your Phoenix tables using `CREATE TABLE` is going to be less 
painful.


Among other things, beware of the change where column encoding is turned on 
by default in Phoenix 5.x: http://phoenix.apache.org/columnencoding.html. 
You would need to disable column encoding, assuming you want to retain the 
original data layout (e.g. COLUMN_ENCODED_BYTES=0).
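
For example, a rough sketch (placeholder table and column names, not your 
actual schema) of recreating a table with the pre-encoding layout:

CREATE TABLE MY_SCHEMA.MY_TABLE (
    ID VARCHAR NOT NULL PRIMARY KEY,
    CF.COL1 VARCHAR,
    CF.COL2 INTEGER
) COLUMN_ENCODED_BYTES = 0;

The important part is the COLUMN_ENCODED_BYTES = 0 table property on the 
CREATE TABLE statement.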


On 6/22/21 8:57 AM, Ankit Joshi wrote:

Hello Team,

Getting error while upgrading from 4.8.2 to 5.2.1.

ERROR 504 (42703) : Undefined column: 
ColumnName=SYSTEM.CATALOG.COLUMN_QUALIFIER

Failed upgrading System tables.

Kindly help.

Thanks &  Regards,
Ankit Joshi


Re: [DISCUSS] Unbundling Sqlline and slf4j backend from phoenix-client and phoenix-client embedded

2021-05-26 Thread Josh Elser
I think the idea is that we would include a sqlline jar with the Phoenix 
distribution. Context: we had some grief where a sqlline upgrade caused 
user pain because they were relying on specific output from sqlline.


If we have the sqlline jar _not_ packaged inside phoenix-client, then 
users can easily replace the version of sqlline which makes them happiest.


While I agree with Istvan that #1 is the more "correct" option, I'm 
worried about the impact of folks who rely on the phoenix-client.jar to 
be a "batteries included" fat-jar. Removing sqlline from 
phoenix-client-embedded is great, so I'd lean towards #2.


We can see what adoption of phoenix-client-embedded looks like now that 
we have it in releases. I imagine most folks haven't yet realized that 
it's even an option that's available.


On 5/26/21 1:16 PM, la...@apache.org wrote:

Will sqlline still be part of the Phoenix "distribution"? Or will it become a 
separate package to install?






On Wednesday, May 26, 2021, 1:07:17 AM PDT, Istvan Toth wrote:

Hi!

The current purpose of the phoenix-client JAR is twofold:
- It serves as a generic JDBC driver for embedding in applications
- It also contains the sqlline library used by the sqlline.py script, as
well as the slf4j log4j backend.
- (It also contains some Phoenix code and HBase libraries not necessary
for a client, but we're already tracking that in different tickets)

One major pain point is the slf4j backend, which makes phoenix-client
incompatible with applications and libraries that do not use log4j 1.2 as a
backend, and kind of defeats the purpose of using slf4j in the first place.
phoenix-client-embedded solves this problem by removing the slf4j backend
from Phoenix.

In PHOENIX-6378 we aim
to remove sqlline from the phoenix-client JAR, as it further cleans up the
classpath, and avoids locking phoenix to the sqlline version that it was
built with.

In Richard's current patch, we remove sqlline from phoenix-client-embedded,
and use that in the sqlline script.

In our quest for a more usable phoenix-client, we can do two things now:

   1. Remove both the slf4j backend, and sqlline from phoenix-client, and
   also drop phoenix-client-embedded as it would be the same as phoenix-client
   2. Remove sqlline from phoenix-client-embedded, and keep the current
   phoenix-client as a backwards-compatibility option

I'd prefer the first option, but this is somewhat more disruptive than the
other.

Please share your thoughts. Do you prefer option 1, 2, or something else
entirely?

Istvan



Re: Advice wanted on supporting a tag feature for searching an HBase Table via Phoenix

2021-02-22 Thread Josh Elser
I had a similar sort of issue (granted, at a smaller data scale), and I went 
with option 2.


If you put the rowkey of your "data" table plus the tag itself into the 
rowkey for your other table/index, you should be able to grow without 
running into HBase scalability (though, pulling 10GB of tags for one 
lookup would be crazy slow :P). It's a fast rowkey prefix scan to pull 
all the tags for the "data record".


Just don't forget that hbase won't split a single row across multiple 
Regions. That's the important part in designing this table.
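
A rough sketch of what I mean (table and column names made up, untested), 
plus whatever other attributes you need on the data table:

CREATE TABLE DATA_RECORDS (
    ID VARCHAR NOT NULL PRIMARY KEY,
    NAME VARCHAR
);

CREATE TABLE RECORD_TAGS (
    RECORD_ID VARCHAR NOT NULL,
    TAG_TYPE VARCHAR NOT NULL,
    TAG VARCHAR NOT NULL,
    CONSTRAINT PK PRIMARY KEY (RECORD_ID, TAG_TYPE, TAG)
);

-- all tags for one data record: a rowkey-prefix scan
SELECT TAG_TYPE, TAG FROM RECORD_TAGS WHERE RECORD_ID = 'some-id';

Going the other direction (finding records by tag) is where a secondary index 
on (TAG, RECORD_ID), or a second table keyed the other way around, comes in.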


On 2/21/21 11:51 PM, Simon Mottram wrote:
The requirement is to be able to search from a list of tags; each record 
can have a possibly large number of tags.  There would be more than one 
tag field.


An example might be 3 different hashtag fields.  They do have to be 
different; we can't have just one tag cloud.


The data size is large so we need to be able to search the tag clouds 
over large numbers.  Millions but not billions (for now)


e.g:

I was wondering what the best method would be

1) a column per tag value.
ID, name, some_attributes..., type1_tag_1,  type1_tag_2

While hbase is happy with many columns I can't see how to index this

2) A tag join table.  Maybe just a single row key  ID + single tag.  
Then it becomes a straight join of ID + tag.   Thus it would be indexed.


3) Is there a crafty way of using column families?  Could that be 
indexed efficiently?


Any tips/tricks gratefully received

Simon


Re: [ANNOUNCE] New Phoenix committer Richárd Antal

2021-01-06 Thread Josh Elser

Congrats, Richard!

On 1/4/21 3:31 PM, Ankit Singhal wrote:
On behalf of the Apache Phoenix PMC, I'm pleased to announce that 
Richárd Antal

has accepted the PMC's invitation to become a committer on Apache Phoenix.

We appreciate all of the great contributions Richárd has made to the
community thus far and we look forward to his continued involvement.

Congratulations and welcome, Richárd Antal!




Re: Phoenix connection issue

2020-10-26 Thread Josh Elser
If you are not using Kerberos authentication, then you need to figure 
out how your DNS names are resolving for your VM. A 
SocketTimeoutException means that your client could not open a socket 
to the host:port that was specified. Likely, this is something you 
need to figure out with your VM.


On 10/23/20 10:29 AM, Varun Rajavelu wrote:

Hey Hi,

Thanks for the response. Yeah, I'm not using any Kerberos 
authentication; HBase is still on the simple authentication method only. Can you 
please tell me whether I've made any mistakes in the presto catalog 
configuration for the communication between HBase and Presto? Please help 
me with any suggestions on this issue. I am able to query the HBase table from 
Phoenix, but the issue comes only when connecting Presto to Phoenix.


On Fri, 23 Oct, 2020, 7:27 pm Josh Elser <els...@apache.org> wrote:


(-to: dev@phoenix, +bcc: dev@phoenix, +to: user@phoenix)

I've taken the liberty of moving this over to the user list.

Typically, such an exception is related to Kerberos authentication,
when
the HBase service denies an incoming, non-authenticated client.
However, since you're running inside of a VM, I'd suggest you should
validate that your environment is sane, first.

HDP 2.6 never shipped an HBase 1.5 release, so it seems like you have
chosen to own your own set of versions. I'll assume that you have
already validated that the versions of Hadoop, HBase, and Phoenix that
you are running are compatible with one another. A common debugging
trick is to strip away the layers of complexity so that you can isolate
your problem. Can you talk to HBase or Phoenix not within Presto?

On 10/23/20 8:58 AM, Varun Rajavelu wrote:
 > Hi,,
 >
 > I'm currently using HDP 2.6 and hbase1.5 and phoenix 4.7 and
presto-server
 > which i have been using presto344. Im facing issue while connecting
 > prestoserver with hbase;
 >
 > *./presto --server localhost:14100 --catalog phoenix*
 > *Issue:*
 > Fri Oct 23 18:11:06 IST 2020, null, java.net.SocketTimeoutException:
 > callTimeout=6, callDuration=69066: Call to
 > sandbox-hdp.hortonworks.com/127.0.0.1:16020 failed on local exception:
 > java.io.EOFException row 'SYSTEM:CATALOG,,' on table 'hbase:meta' at
 > region=hbase:meta,,1.1588230740,
 > hostname=sandbox-hdp.hortonworks.com,16020,1603447852070,
 > seqNum=0
 >
 > *My presto catalog Config:*
 > connector.name=phoenix
 > phoenix.connection-url=jdbc:phoenix:localhost:2181:/hbase-unsecure
 >
phoenix.config.resources=/home/desa/Downloads/hbase_schema/hbase-site.xml
 >
 > Kindly please have a look and help me to resolve this issue.
 >



Re: Phoenix connection issue

2020-10-23 Thread Josh Elser

(-to: dev@phoenix, +bcc: dev@phoenix, +to: user@phoenix)

I've taken the liberty of moving this over to the user list.

Typically, such an exception is related to Kerberos authentication, when 
the HBase service denies an incoming, non-authenticated client. 
However, since you're running inside of a VM, I'd suggest you should 
validate that your environment is sane, first.


HDP 2.6 never shipped an HBase 1.5 release, so it seems like you have 
chosen to own your own set of versions. I'll assume that you have 
already validated that the versions of Hadoop, HBase, and Phoenix that 
you are running are compatible with one another. A common debugging 
trick is to strip away the layers of complexity so that you can isolate 
your problem. Can you talk to HBase or Phoenix not within Presto?


On 10/23/20 8:58 AM, Varun Rajavelu wrote:

Hi,,

I'm currently using HDP 2.6 with HBase 1.5 and Phoenix 4.7, and the presto-server
I have been using is Presto 344. I'm facing an issue while connecting the
Presto server with HBase:

*./presto --server localhost:14100 --catalog phoenix*
*Issue:*
Fri Oct 23 18:11:06 IST 2020, null, java.net.SocketTimeoutException:
callTimeout=6, callDuration=69066: Call to
sandbox-hdp.hortonworks.com/127.0.0.1:16020 failed on local exception:
java.io.EOFException row 'SYSTEM:CATALOG,,' on table 'hbase:meta' at
region=hbase:meta,,1.1588230740,
hostname=sandbox-hdp.hortonworks.com,16020,1603447852070,
seqNum=0

*My presto catalog Config:*
connector.name=phoenix
phoenix.connection-url=jdbc:phoenix:localhost:2181:/hbase-unsecure
phoenix.config.resources=/home/desa/Downloads/hbase_schema/hbase-site.xml

Kindly please have a look and help me to resolve this issue.



Re: Unable to use some functions on DECIMAL columns

2020-08-03 Thread Josh Elser
Any stacktrace? An error message on its own isn't too helpful (as it 
could come from any number of places).


We have lots of unit tests in the project. Ideally, a unit test which 
illustrates the error is the best way for someone to start poking at the 
problem. Figuring out if it's a problem unique to the thin-client or if 
it affects both the thin-client and thick-client is important.
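
In the meantime, purely as an untested guess at a workaround (not a fix), 
you could see whether casting the DECIMAL column to DOUBLE avoids the 
failing code path, e.g.:

SELECT PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY CAST(DECIMALCOLUMN AS DOUBLE) ASC) 
FROM TEST.TESTEXCEPTIONS;

Whether that helps or not would also be a useful data point for a Jira issue.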


On 7/31/20 9:10 PM, Simon Mottram wrote:

Hi Josh

Thanks very much for the reply, I did share the error at the top of the 
email


SQL Error [0]: Error -1 (0) : Error while executing SQL "ESELECT 
PERCENTILE_DISC (0.5) WITHIN GROUP (ORDER BY DECIMALCOLUMN ASC) FROM 
TEST.TESTEXCEPTIONS": Remote driver error: 
ArrayIndexOutOfBoundsException: (null exception message)


There are a few SQL statements to reliably reproduce the error; I thought 
that would be enough, but I'm very open to providing any assistance I 
can.  What further info would be helpful?


Cheers

S
----
*From:* Josh Elser 
*Sent:* 01 August 2020 2:26 AM
*To:* user@phoenix.apache.org 
*Subject:* Re: Unable to use some functions on DECIMAL columns
Simon,

If you have a clear bug report, please open up a Jira issue for it. Keep
in mind that the more information you can provide to indicate the
problem you see, the better. Assume that whoever might read your Jira
issue is coming with zero context. Right now, you haven't shared any
error, so you're expecting a bit from someone: to help you by first
reproducing the code you've shared, analyzing whether an error is expected,
and then fixing it.

Ideally, you can submit a patch to try to fix the issues you're seeing ;)

On 7/29/20 10:18 PM, Simon Mottram wrote:


If you try the queries marked with BUG below, you get an exception

SQL Error [0]: Error -1 (0) : Error while executing SQL "ESELECT 
PERCENTILE_DISC (0.5) WITHIN GROUP (ORDER BY DECIMALCOLUMN ASC) FROM 
TEST.TESTEXCEPTIONS": Remote driver error: 
ArrayIndexOutOfBoundsException: (null exception message)


As far as I can tell I am using the functions correctly.

Best Regards

Simon

To reproduce:

Using HBase
HBASE_VERSION=2.0.0
HBASE_MINOR_VERSION=2.0
PHOENIX_VERSION=5.0.0

Connecting using thin client: phoenix-5.0.0-HBase-2.0-thin-client.jar

NOTE: We can't use thick client as I haven't resolved issues connecting 
my API which runs inside docker.  That's another story.


CREATE TABLE IF NOT EXISTS TEST.TESTEXCEPTIONS ( KEYCOL VARCHAR NOT NULL 
PRIMARY KEY, INTEGERCOLUMN INTEGER , DECIMALCOLUMN DECIMAL);


UPSERT INTO TEST.TESTEXCEPTIONS(KEYCOL, INTEGERCOLUMN, DECIMALCOLUMN) 
VALUES('A', 1, 1.1);
UPSERT INTO TEST.TESTEXCEPTIONS(KEYCOL, INTEGERCOLUMN, DECIMALCOLUMN) 
VALUES('B', 2, 2.2);
UPSERT INTO TEST.TESTEXCEPTIONS(KEYCOL, INTEGERCOLUMN, DECIMALCOLUMN) 
VALUES('C', 3, 3.3);


-- PERCENTILE_DISC
-- Integer columns works
SELECT PERCENTILE_DISC (0.5) WITHIN GROUP (ORDER BY INTEGERCOLUMN ASC) 
FROM TEST.TESTEXCEPTIONS;

-- BUG: Decimal columns throws NPE
SELECT PERCENTILE_DISC (0.5) WITHIN GROUP (ORDER BY DECIMALCOLUMN ASC) 
FROM TEST.TESTEXCEPTIONS;


-- STDDEV_POP
-- Integer columns works
SELECT STDDEV_POP(INTEGERCOLUMN) FROM TEST.TESTEXCEPTIONS;

-- BUG:  Decimal columns throws NPE
SELECT STDDEV_POP(DECIMALCOLUMN) FROM TEST.TESTEXCEPTIONS;

-- STDDEV_SAMP
-- Integer columns works
SELECT STDDEV_SAMP(INTEGERCOLUMN) FROM TEST.TESTEXCEPTIONS;

-- BUG:  Decimal columns throws NPE
SELECT STDDEV_SAMP(DECIMALCOLUMN) FROM TEST.TESTEXCEPTIONS;


-- PERCENTILE_CONT
-- Integer columns works
SELECT PERCENTILE_CONT (0.5) WITHIN GROUP (ORDER BY INTEGERCOLUMN ASC) 
FROM TEST.TESTEXCEPTIONS;

-- Decimal columns works
SELECT PERCENTILE_CONT (0.5) WITHIN GROUP (ORDER BY DECIMALCOLUMN ASC) 
FROM TEST.TESTEXCEPTIONS;


Re: Unable to use some functions on DECIMAL columns

2020-07-31 Thread Josh Elser

Simon,

If you have a clear bug report, please open up a Jira issue for it. Keep 
in mind that the more information you can provide to indicate the 
problem you see, the better. Assume that whoever might read your Jira 
issue is coming with zero context. Right now, you haven't shared any 
error, so you're expecting a bit from someone: to help you by first 
reproducing the code you've shared, analyzing whether an error is expected, 
and then fixing it.


Ideally, you can submit a patch to try to fix the issues you're seeing ;)

On 7/29/20 10:18 PM, Simon Mottram wrote:


If you try the queries marked with BUG below, you get an exception

SQL Error [0]: Error -1 (0) : Error while executing SQL "ESELECT 
PERCENTILE_DISC (0.5) WITHIN GROUP (ORDER BY DECIMALCOLUMN ASC) FROM 
TEST.TESTEXCEPTIONS": Remote driver error: 
ArrayIndexOutOfBoundsException: (null exception message)


As far as I can tell I am using the functions correctly.

Best Regards

Simon

To reproduce:

Using HBase
HBASE_VERSION=2.0.0
HBASE_MINOR_VERSION=2.0
PHOENIX_VERSION=5.0.0

Connecting using thin client: phoenix-5.0.0-HBase-2.0-thin-client.jar

NOTE: We can't use thick client as I haven't resolved issues connecting 
my API which runs inside docker.  That's another story.


CREATE TABLE IF NOT EXISTS TEST.TESTEXCEPTIONS ( KEYCOL VARCHAR NOT NULL 
PRIMARY KEY, INTEGERCOLUMN INTEGER , DECIMALCOLUMN DECIMAL);


UPSERT INTO TEST.TESTEXCEPTIONS(KEYCOL, INTEGERCOLUMN, DECIMALCOLUMN) 
VALUES('A', 1, 1.1);
UPSERT INTO TEST.TESTEXCEPTIONS(KEYCOL, INTEGERCOLUMN, DECIMALCOLUMN) 
VALUES('B', 2, 2.2);
UPSERT INTO TEST.TESTEXCEPTIONS(KEYCOL, INTEGERCOLUMN, DECIMALCOLUMN) 
VALUES('C', 3, 3.3);


-- PERCENTILE_DISC
-- Integer columns works
SELECT PERCENTILE_DISC (0.5) WITHIN GROUP (ORDER BY INTEGERCOLUMN ASC) 
FROM TEST.TESTEXCEPTIONS;

-- BUG: Decimal columns throws NPE
SELECT PERCENTILE_DISC (0.5) WITHIN GROUP (ORDER BY DECIMALCOLUMN ASC) 
FROM TEST.TESTEXCEPTIONS;


-- STDDEV_POP
-- Integer columns works
SELECT STDDEV_POP(INTEGERCOLUMN) FROM TEST.TESTEXCEPTIONS;

-- BUG:  Decimal columns throws NPE
SELECT STDDEV_POP(DECIMALCOLUMN) FROM TEST.TESTEXCEPTIONS;

-- STDDEV_SAMP
-- Integer columns works
SELECT STDDEV_SAMP(INTEGERCOLUMN) FROM TEST.TESTEXCEPTIONS;

-- BUG:  Decimal columns throws NPE
SELECT STDDEV_SAMP(DECIMALCOLUMN) FROM TEST.TESTEXCEPTIONS;


-- PERCENTILE_CONT
-- Integer columns works
SELECT PERCENTILE_CONT (0.5) WITHIN GROUP (ORDER BY INTEGERCOLUMN ASC) 
FROM TEST.TESTEXCEPTIONS;

-- Decimal columns works
SELECT PERCENTILE_CONT (0.5) WITHIN GROUP (ORDER BY DECIMALCOLUMN ASC) 
FROM TEST.TESTEXCEPTIONS;


Re: Add pseudo columns to an existing view

2020-07-31 Thread Josh Elser

Yes. Please see the docs which illustrate how to create views.

On 7/28/20 2:12 AM, Arun J wrote:

Team,
Can we create or alter a view to add a column which is a pseudo column, 
such as:


# FULLNAME from other columns such as FIRSTNAME + LASTNAME
# NUMVALUE from other columns TO_NUMBER(STRVALUE)

This would be quite helpful to avoid changing data types every time in 
the client query.


Thanks in advance.
JAK


Re: how to connect phoenix cluster enabled with Kerberos using Java JDBC

2020-07-31 Thread Josh Elser

You're missing a colon between the port and root znode in your JDBC URL.

From http://phoenix.apache.org/

```
jdbc:phoenix [ :<zookeeper quorum> [ :<port number> [ :<root node> [
:<principal> [ :<keytab file> ] ] ] ] ]
```
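
Applied to the URL in your code (keeping your own host, principal, and 
keytab, which I'm only sketching as placeholders here), that means 
something like:

jdbc:phoenix:cdp2.hadoop.com:2181:/hbase:<your principal>:<path to keytab>

Note the extra ":" between the port (2181) and the root znode (/hbase).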

On 7/23/20 4:24 AM, Istvan Toth wrote:

The code looks OK.
Check that you can resolve the name of, and have IP connectivity to, 
*each* HBase host (master/regionserver) in the cluster.


regards
Istvan

On Wed, Jul 22, 2020 at 3:01 PM 黄乐平 <18702515...@163.com> wrote:


My code is like this:

public class PhoenixDemo {

    public static void main(String[] args) {
        Connection connection = null;
        Statement statement = null;
        ResultSet rs = null;
        PreparedStatement ps = null;
        org.apache.hadoop.conf.Configuration conf = null;

        try {
            Connection conn = null;
            Class.forName("org.apache.phoenix.jdbc.PhoenixDriver");
            connection = DriverManager.getConnection(
                "jdbc:phoenix:cdp2.hadoop.com:2181/hbase:hb...@hadoop.com:C:\\hbase.keytab");
            System.out.println("Connection established");
            // Create a JDBC statement
            statement = connection.createStatement();
            // Execute our statements
            statement.executeUpdate(
                "create table user (id INTEGER NOT NULL PRIMARY KEY, d.first_name VARCHAR, d.last_name VARCHAR)");
            statement.executeUpdate("upsert into user values (1,'John','Mayer')");
            statement.executeUpdate("upsert into user values (2,'Eva','Peters')");
            connection.commit();

            // Query for selecting records from table
            ps = connection.prepareStatement("select * from user");
            rs = ps.executeQuery();
            System.out.println("Table Values");
            while (rs.next()) {
                Integer id = rs.getInt(1);
                String name = rs.getString(2);
                System.out.println("\tRow: " + id + " = " + name);
            }
        } catch (SQLException | ClassNotFoundException e) {
            e.printStackTrace();
        } finally {
            if (ps != null) {
                try {
                    ps.close();
                } catch (Exception e) {
                }
            }
            if (rs != null) {
                try {
                    rs.close();
                } catch (Exception e) {
                }
            }
            if (statement != null) {
                try {
                    statement.close();
                } catch (Exception e) {
                }
            }
            if (connection != null) {
                try {
                    connection.close();
                } catch (Exception e) {
                }
            }
        }
    }
}

  When I run the code, it just hangs for a long time with no response. Are there any
errors in my code?

黄乐平
18702515...@163.com







Re: Too many connections from / - max is 60

2020-06-03 Thread Josh Elser
         at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
         at sun.nio.ch.IOUtil.read(IOUtil.java:192)
         at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
         at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
         at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125)


Thanks!

On Tue, Jun 2, 2020 at 6:57 AM Josh Elser <els...@apache.org> wrote:


HBase (daemons) try to use a single connection for themselves. A RS also
does not need to mutate state in ZK to handle things like gets and puts.

Phoenix is probably the thing you need to look at more closely
(especially if you're using an old version of Phoenix that matches the
old HBase 1.1 version). Internally, Phoenix acts like an HBase client
which results in a new ZK connection. There have certainly been bugs
like that in the past (speaking generally, not specifically).

On 6/1/20 5:59 PM, anil gupta wrote:
 > Hi Folks,
 >
 > We are running in HBase problems due to hitting the limit of ZK
 > connections. This cluster is running HBase 1.1.x and ZK 3.4.6.x
on I3en ec2
 > instance type in AWS. Almost all our Region server are listed in
zk logs
 > with "Too many connections from / - max is 60".
 > 2020-06-01 21:42:08,375 - WARN  [NIOServerCxn.Factory:
 > 0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@193] - Too many connections from
 > / - max is 60
 >
 >   On a average each RegionServer has ~250 regions. We are also
running
 > Phoenix on this cluster. Most of the queries are short range
scans but
 > sometimes we are doing full table scans too.
 >
 >    It seems like one of the simple fix is to increase maxClientCnxns
 > property in zoo.cfg to 300, 500, 700, etc. I will probably do
that. But, i
 > am just curious to know In what scenarios these connections are
 > created/used(Scans/Puts/Delete or during other RegionServer
operations)?
 > Are these also created by hbase clients/apps(my guess is NO)? How
can i
 > calculate optimal value of maxClientCnxns for my cluster/usage?
 >



--
Thanks & Regards,
Anil Gupta


Re: Error Parameter value unbound. Parameter at index 1 is unbound on Subquery

2020-05-29 Thread Josh Elser

Gentle reminder to always do a general search for similar issues

https://issues.apache.org/jira/browse/PHOENIX-5192

There have been some issues which have been similarly reported in the 
past, but no one has been able to provide a reproduction. Perhaps you 
can be the one to do that. As of now, your reported issue is not 
actionable as we don't know where it is.


Finally, you're using an extremely old version of HBase and Phoenix. You 
should upgrade.


On 5/29/20 3:11 AM, Ganda Manurung wrote:


Hello,

I have a query like this on my program

SELECT COUNT(*) AS COUNTER from ( SELECT FULLNAME FROM TABLE_A WHERE ID 
= '1' AND STATUS = 'N'  UNION ALL SELECT FULLNAME FROM TABLE_B WHERE ID 
= '1' AND STATUS = 'Y'  UNION ALL  SELECT FULLNAME FROM TABLE_C WHERE ID 
= '1' AND STATUS = 'N' ) AS TEMP


And it runs smoothly when I tried it using SquirrelSQL with Phoenix JDBC 
Thin Client.


However, when I try the query in Java with a prepared statement, the query 
is changed like below:


SELECT COUNT(*) AS COUNTER from ( SELECT FULLNAME FROM TABLE_A WHERE ID 
= ? AND STATUS = 'N'  UNION ALL SELECT FULLNAME FROM TABLE_B WHERE ID = 
? AND STATUS = 'Y'  UNION ALL  SELECT FULLNAME FROM TABLE_C WHERE ID = ? 
AND STATUS = 'N' ) AS TEMP


And as the ID is a string, I set the parameter with code like this

PreparedStatement secondStatement = 
super.getConnection().prepareStatement(sqlQuery);


secondStatement.setString(1, ID);

secondStatement.setString(2, ID);

secondStatement.setString(3, ID);


ResultSet secondResultset = secondStatement.executeQuery();


I expect it should be working, but I got this error



org.apache.calcite.avatica.AvaticaSqlException: Error -1 (0) : while 
preparing SQL: SELECT COUNT(*) AS COUNTER from ( SELECT FULLNAME FROM 
TABLE_A WHERE ID = ? AND STATUS = 'N'  UNION ALL SELECT FULLNAME FROM 
TABLE_B WHERE ID = ? AND STATUS = 'Y'  UNION ALL  SELECT FULLNAME FROM 
TABLE_C WHERE ID = ? AND STATUS = 'N' ) AS TEMP

at org.apache.calcite.avatica.Helper.createException(Helper.java:53)
at org.apache.calcite.avatica.Helper.createException(Helper.java:41)
at 
org.apache.calcite.avatica.AvaticaConnection.prepareStatement(AvaticaConnection.java:314)
at 
org.apache.calcite.avatica.AvaticaConnection.prepareStatement(AvaticaConnection.java:153)
at 
com.btpn.jdbc.interfacing.interfaces.impl.VerifyCIF.query(VerifyCIF.java:49)

at com.btpn.jdbc.interfacing.Fetcher.call(Fetcher.java:39)
at MainPhoenix.main(MainPhoenix.java:69)
java.lang.RuntimeException: java.sql.SQLException: ERROR 2004 (INT05): 
Parameter value unbound. Parameter at index 1 is unbound

at org.apache.calcite.avatica.jdbc.JdbcMeta.propagate(JdbcMeta.java:651)
at org.apache.calcite.avatica.jdbc.JdbcMeta.prepare(JdbcMeta.java:677)
at 
org.apache.calcite.avatica.remote.LocalService.apply(LocalService.java:177)
at 
org.apache.calcite.avatica.remote.Service$PrepareRequest.accept(Service.java:1113)
at 
org.apache.calcite.avatica.remote.Service$PrepareRequest.accept(Service.java:1091)
at 
org.apache.calcite.avatica.remote.AbstractHandler.apply(AbstractHandler.java:102)
at 
org.apache.calcite.avatica.remote.ProtobufHandler.apply(ProtobufHandler.java:38)
at 
org.apache.calcite.avatica.server.AvaticaProtobufHandler.handle(AvaticaProtobufHandler.java:68)

at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)

at org.eclipse.jetty.server.Server.handle(Server.java:497)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:245)
at 
org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)

at java.lang.Thread.run(Thread.java:748)
Caused by: java.sql.SQLException: ERROR 2004 (INT05): Parameter value 
unbound. Parameter at index 1 is unbound
at 
org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:422)
at 
org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145)
at 
org.apache.phoenix.jdbc.PhoenixParameterMetaData.getParam(PhoenixParameterMetaData.java:88)
at 
org.apache.phoenix.jdbc.PhoenixParameterMetaData.isSigned(PhoenixParameterMetaData.java:138)

at org.apache.calcite.avatica.jdbc.JdbcMeta.parameters(JdbcMeta.java:231)
at org.apache.calcite.avatica.jdbc.JdbcMeta.signature(JdbcMeta.java:244)
at org.apache.calcite.avatica.jdbc.JdbcMeta.prepare(JdbcMeta.java:669)
... 15 more



How do I fix this? Is there anything wrong?

I am using Apache Phoenix 4.7.0 and Hbase 1.1


Thank you and regards,


Ganda

--
Ganda Manurung


Re: How to bulk load a csv file into a phoenix table created in lowercase

2020-05-14 Thread Josh Elser

Quoting can be really annoying.

Remember that when you are executing commands, you have to deal with 
what your shell will do to quoting.


You say that escaping the quotes with backslashes doesn't work? e.g. `psql -t 
MICHTEST.\"StagingNotificationPreferencesRT\" ...`


What else did you try? Have you checked to see if there are any known 
issues around quoting/naming?


On 5/11/20 4:45 PM, Mich Talebzadeh wrote:

Hi,

I have a phoenix table *created in lowercase *as follows:

CREATE TABLE "MICHTEST"."StagingNotificationPreferencesRT"
(
ROWKEY VARCHAR NOT NULL PRIMARY KEY
,  "cf"."partyId" VARCHAR
, "cf"."childNotificationId" VARCHAR
, "cf"."brand" VARCHAR
, "cf"."accountReference" VARCHAR
, "cf"."expiredDate" VARCHAR
, "cf"."parentNotificationId" VARCHAR
...

This table creates OK. When I try to bulk load from a csv file at Linux 
bash command line as follows


*psql -t MICHTEST."StagingNotificationPreferencesRT" 
./StagingNotificationPreferences.csv  -s*


20/05/11 21:35:07 ERROR util.CSVCommonsLoader: Error upserting record 
[3c7953b3-0c69-42ee-9abf-fe7a77d29c87, 876543914, 1, LTB, , , 50, 
76543210, , 2017-13-18 18:29:23:345, 19, 1, , , , , , , 9876543210123456]


java.lang.RuntimeException: 
org.apache.phoenix.schema.TableNotFoundException: ERROR 1012 (42M03): 
Table undefined. tableName=MICHTEST.STAGINGNOTIFICATIONPREFERENCESRT



Note that the table name shown in error is uppercase! Now this works if 
I go back and create table all in UPPERCASE!



Some documentation in below link


https://phoenix.apache.org/bulk_dataload.html


talks about using a backslash with the table name etc. but that does not work.


Any ideas how to bulk load into a table created in mixed case?


Regards,


Mich


Dr Mich Talebzadeh

LinkedIn: 
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw


http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any 
loss, damage or destruction of data or any other property which may 
arise from relying on this email's technical content is explicitly 
disclaimed. The author will in no case be liable for any monetary 
damages arising from such loss, damage or destruction.




Re: Latest supported HBase version

2020-04-21 Thread Josh Elser

Please read the website for information on Apache Phoenix releases.

http://phoenix.apache.org/download.html

On 4/20/20 3:35 PM, Arun J wrote:

What is the latest HBase version that Apache Phoenix 5.x supports?

We are forced to explore at both CDH and non-CDH options.

CDH claims a Phoenix 5.0 parcel for CDH 6.2+, but we are not sure of the HBase 
version, while CDH 6.2.3 packages HBase 2.1.0.  Since CDH is discontinuing the 
community version, we are exploring options outside of it, hence 
checking what the latest HBase version is that Phoenix supports.


 >> Cloudera has released a Phoenix 4.14.1 parcel 
<https://www.cloudera.com/documentation/enterprise/5/latest/topics/phoenix_installation.html> 
available to CDH 5.16.2 customers and Phoenix 5.0 parcel available to CDH 6.2+ 
customers.



On Mon, Apr 20, 2020 at 8:49 PM Josh Elser <els...@apache.org> wrote:


You should contact Cloudera for information on how to use Phoenix on
CDH.

There is no upstream build of Apache Phoenix 5.x which was built and
tested against any release of CDH. There is only the downstream work
which Cloudera packages and provides themselves.

On 4/20/20 11:09 AM, Arun J wrote:
 >
 > All,
 >
 > I see that the latest support HBase version is 2.0 from the below
 > listing here <http://phoenix.apache.org/download.html>
 >
 > We use CDH6.3.2 which is documented to be HBase 2.14 however it uses
 > HBase 2.1.0.  Looking to use Phoenix on this CDH ecosystem but
not sure
 > whether there is a supported version against this HBase (2.1). This
 > CDH version is the last community version and I am thinking many
would
 > be on this HBase (2.1) version.
 >
 > There are some aggregated functions that is supported in HBase 2x
and I
 > am curious to see whether Phoenix supports and leverages those
and if
 > there is a supported version on 2.1.  I saw this thread but I
couldn't
 > get anything concrete finalized.
 >
 >

https://lists.apache.org/thread.html/210c38fad6b42251bfac67d778577e356944dea7f9baa20526d64c76%40%3Cdev.phoenix.apache.org%3E
 >
 > I also found the release date is July 2018 for HBase 2.0 whereas
 > HBase1.5 is Dec 2019.  Is this release date accurate.  I am little
 > confused with the  chronology of dates below.  Any release
further to
 > this on HBase 2.x?
 >
 > TIA,
 > JAK
 >
 > Version       Release Date
 > 5.0.0-HBase-2.0       04/jul/2018
 >
 > 4.15.0-HBase-1.5      20/dec/2019
 >
 > 4.15.0-HBase-1.4      20/dec/2019
 >



Re: Latest supported HBase version

2020-04-20 Thread Josh Elser

You should contact Cloudera for information on how to use Phoenix on CDH.

There is no upstream build of Apache Phoenix 5.x which was built and 
tested against any release of CDH. There is only the downstream work 
which Cloudera packages and provides themselves.


On 4/20/20 11:09 AM, Arun J wrote:


All,

I see that the latest supported HBase version is 2.0 from the below 
listing here <http://phoenix.apache.org/download.html>


We use CDH6.3.2 which is documented to be HBase 2.14 however it uses 
HBase 2.1.0.  Looking to use Phoenix on this CDH ecosystem but not sure 
whether there is a supported version against this HBase (2.1).   This 
CDH version is the last community version and I am thinking many would 
be on this HBase (2.1) version.


There are some aggregated functions that is supported in HBase 2x and I 
am curious to see whether Phoenix supports and leverages those and if 
there is a supported version on 2.1.  I saw this thread but I couldn't 
get anything concrete finalized.


https://lists.apache.org/thread.html/210c38fad6b42251bfac67d778577e356944dea7f9baa20526d64c76%40%3Cdev.phoenix.apache.org%3E

I also found the release date is July 2018 for HBase 2.0 whereas 
HBase1.5 is Dec 2019.  Is this release date accurate.  I am little 
confused with the  chronology of dates below.  Any release further to 
this on HBase 2.x?


TIA,
JAK

Version             Release Date
5.0.0-HBase-2.0     04/jul/2018
4.15.0-HBase-1.5    20/dec/2019
4.15.0-HBase-1.4    20/dec/2019



[ANNOUNCE] New VP Apache Phoenix

2020-04-16 Thread Josh Elser
I'm pleased to announce that the ASF board has just approved the 
transition of VP Phoenix from myself to Ankit. As with all things, this 
comes with the approval of the Phoenix PMC.


The ASF defines the responsibilities of the VP to be largely oversight 
and secretarial. That is, a VP should be watching to make sure that the 
project is following all foundation-level obligations and writing the 
quarterly project reports about Phoenix to summarize the happenings. Of 
course, a VP can choose to use this title to help drive movement and 
innovation in the community, as well.


With this VP rotation, the PMC has also implicitly agreed to focus on a 
more regular rotation schedule of the VP role. The current plan is to 
revisit the VP role in another year.


Please join me in congratulating Ankit on this new role and thank him 
for volunteering.


Thank you all for the opportunity to act as VP these last years.

- Josh


Re: Reverse engineer a phoenix table definition

2020-04-14 Thread Josh Elser
That should be there already, but that doesn't help the existing 4.x 
release lines (which, I assume, would be what Mich cares about).


On 4/14/20 11:59 AM, Sukumar Maddineni wrote:
How about a simple idea of redirecting all DDL statements to SYSTEM.LOG 
by default which will be useful for logging+auditing purposes and also 
for recreating table if needed.


--
Sukumar

On Tue, Apr 14, 2020 at 8:49 AM Geoffrey Jacoby <gjac...@salesforce.com> wrote:


This is a frequent feature request we unfortunately haven't
implemented yet -- see PHOENIX-4286 and PHOENIX-5054, one of which I
filed and the other one Josh did. :-)

I agree with Josh, I'd love to see an implementation of this if
someone has bandwidth.

Geoffrey Jacoby

On Tue, Apr 14, 2020 at 8:01 AM Josh Elser <els...@apache.org> wrote:

Yeah, I don't have anything handy.

I'll happily review and commit such a utility if you happen to
write one
(even if flawed).

On 4/12/20 1:31 AM, Simon Mottram wrote:
 > Best I can offer is
 >
 >   "SELECT * FROM SYSTEM.CATALOG where table_name = '" +
tableName + "'
 > and table_schem = '"  +schemaName + "'"
 >
 > S
 >

 > *From:* Mich Talebzadeh <mich.talebza...@gmail.com>
 > *Sent:* Sunday, 12 April 2020 1:36 AM
 > *To:* user <user@phoenix.apache.org>
 > *Subject:* Reverse engineer a phoenix table definition
 > Hi,
 >
 > I was wondering if anyone has a handy script to reverse
engineer an
 > existing table schema.
 >
 > I guess one can get the info from system.catalog table to
start with.
 > However, I was wondering if there is a shell script already
or I have to
 > write my own.
 >
 > Thanks,
 >
 > Dr Mich Talebzadeh
 >
 > LinkedIn
 >

 > https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
 >
 > http://talebzadehmich.wordpress.com
 >
 >
 > *Disclaimer:* Use it at your own risk.Any and all
responsibility for any
 > loss, damage or destruction of data or any other property
which may
 > arise from relying on this email's technical content is
explicitly
 > disclaimed. The author will in no case be liable for any
monetary
 > damages arising from such loss, damage or destruction.
 >



--



Re: Reverse engineer a phoenix table definition

2020-04-14 Thread Josh Elser

Yeah, I don't have anything handy.

I'll happily review and commit such a utility if you happen to write one 
(even if flawed).
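
As a rough starting point (untested; note DATA_TYPE comes back as the 
java.sql.Types integer, which you'd still have to map back to a SQL type 
name), something like this pulls the column metadata in order:

SELECT COLUMN_NAME, COLUMN_FAMILY, DATA_TYPE, COLUMN_SIZE, NULLABLE, KEY_SEQ
FROM SYSTEM.CATALOG
WHERE TABLE_SCHEM = 'MY_SCHEMA'
  AND TABLE_NAME = 'MY_TABLE'
  AND COLUMN_NAME IS NOT NULL
ORDER BY ORDINAL_POSITION;

Turning that (plus the table-level properties) back into a CREATE TABLE 
statement is the part that really wants a proper utility.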


On 4/12/20 1:31 AM, Simon Mottram wrote:

Best I can offer is

  "SELECT * FROM SYSTEM.CATALOG where table_name = '" + tableName + "' 
and table_schem = '"  +schemaName + "'"


S

*From:* Mich Talebzadeh 
*Sent:* Sunday, 12 April 2020 1:36 AM
*To:* user 
*Subject:* Reverse engineer a phoenix table definition
Hi,

I was wondering if anyone has a handy script to reverse engineer an 
existing table schema.


I guess one can get the info from system.catalog table to start with. 
However, I was wondering if there is a shell script already or I have to 
write my own.


Thanks,

Dr Mich Talebzadeh

LinkedIn: 
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw


http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any 
loss, damage or destruction of data or any other property which may 
arise from relying on this email's technical content is explicitly 
disclaimed. The author will in no case be liable for any monetary 
damages arising from such loss, damage or destruction.




Re: Select * gets 0 rows from index table

2020-03-30 Thread Josh Elser

Hey Reid!

Can you clarify a couple of things?

* What version of Phoenix?
* Did you `select * from index_table` verbatim? Most of the time, when 
you have an index table, you'd be interacting with the data table which 
(behind the scenes) goes to the index table.

  * Caveat about covered columns in a query
* What's the state of the index? Look at the INDEX_STATE column in 
system.catalog for your index table (see the query sketch after this list).
* Did you use Phoenix to create the data+index tables and to populate 
the data in those tables?
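
For that INDEX_STATE check, a query along these lines (untested; substitute 
your index name) should do:

SELECT TABLE_SCHEM, TABLE_NAME, INDEX_STATE
FROM SYSTEM.CATALOG
WHERE TABLE_NAME = 'MY_INDEX' AND INDEX_STATE IS NOT NULL;

INDEX_STATE is a single-letter code (the PIndexState enum in the Phoenix 
code); if I remember right, 'a' is what you want to see for ACTIVE.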


On 3/30/20 4:35 AM, Reid Chan wrote:

Hi team,

I encountered a problem where select * from index_table limit x got 0 rows, but 
the underlying hbase table has data (observed from hbase shell > scan), and any 
queries that went to the index table would get 0 rows as well.

In the meantime the server had the following error message: 
"index.GlobalIndexChecker: Could not find the newly rebuilt index row with row key 
xxx for table yyy."

Looking forward to getting some hints from experienced users and devs.

Thanks!

--

Best regards,
R.C




Re: Phoenix dependency jar for PhoenixDataSource

2020-02-27 Thread Josh Elser
We received your email the first time. Please remember that this is a 
community of volunteers. Do not send multiple emails asking the same 
question.


In the future, you can validate that the list received your message via 
the interface at https://lists.apache.org


Thank you.

On 2/27/20 8:38 PM, Raja sekhara reddy wrote:

Hi Phoenix team,


Currently in my project, we are moving data from Solr to HBase, which 
should overwrite the existing data. For this, we used saveToPhoenix, 
but it is not overwriting the existing data.



So, I made changes to the script to use the PhoenixDataSource class, which I 
found in the Phoenix documentation, to save a data frame.


Please find the below screenshot.

Now, in order to use PhoenixDataSource, I imported 
org.apache.phoenix.spark.datasource.v2.PhoenixDataSource. For this I 
tried to add the dependency jars manually from the Maven repo.


In repo, I found 
Phoenix-spark-4.10.0-HBase-1.1.jar, Phoenix-spark-5.0.0-HBase-1.1.jar 
and couple of other versions and I tried. But I couldn’t find 
the PhoenixDataSource class in those jars.



Could you please help me, where can I find the jar with this class.. 
Really appreciate your help.



FYI:

I found PhoenixDataSource from below Phoenix docs.


https://phoenix.apache.org/phoenix_spark.html


Best Regards,
Raja Chintala
(813)203-9974


Re: Query on phoenix upgrade to 5.1.0

2020-01-30 Thread Josh Elser
2020-01-30 13:20:57,623 DEBUG 
org.apache.phoenix.shaded.org.apache.zookeeper.ClientCnxn: Closing 
client for session: 0x16fece700a300fb
2020-01-30 13:20:57,623 DEBUG org.apache.phoenix.jdbc.PhoenixDriver: 
Expiring 
hbase-data-upgrade001-stg.foo.bar,hbase-data-upgrade002-stg.foo.bar,hbase-master-upgrade001-stg.foo.bar:2181:/hbase 
because of EXPLICIT
2020-01-30 13:20:57,623 INFO 
org.apache.phoenix.log.QueryLoggerDisruptor: Shutting down 
QueryLoggerDisruptor..
2020-01-30 13:20:57,627 DEBUG 
org.apache.phoenix.shaded.org.eclipse.jetty.server.Server: RESPONSE / 
  500 handled=true


As for Phoenix - I used a version built by our devs.
To make sure that everything is right, I want to use the official 
version, but I can't find a tarball on the 
https://phoenix.apache.org/download.html page.


On Wed, Jan 29, 2020 at 6:55 PM Josh Elser <els...@apache.org> wrote:


Aleksandr and Prathap,

Upgrades are done in Phoenix as they always have been. You should deploy
the new version of phoenix-server jars to HBase, and then the first time
a client connects with the Phoenix JDBC driver, that client will trigger
an update to any system tables schema.

As such, you need to make sure that this client has permission to alter
the phoenix system tables that exist, often requiring admin-level access
to hbase. Your first step should be collecting DEBUG log from your
Phoenix JDBC client on upgrade.

Please also remember that 5.0.0 is pretty old at this point -- we're
overdue for a 5.1.0. There may be existing issues that have already been
fixed around the upgrade. Doing a search on Jira if you've not done so
already is important.

On 1/29/20 4:30 AM, Aleksandr Saraseka wrote:
 > Hello.
 > I'm second on this.
 > We upgraded phoenix from 4.14.0 to 5.0.0 (with all underlying things
 > like hdfs, hbase) and have the same problem.
 >
 > We are using queryserver + thin-client
 > So on PQS side we have:
 > 2020-01-29 09:24:21,579 INFO org.apache.phoenix.util.UpgradeUtil:
 > Upgrading metadata to add parent links for indexes on views
 > 2020-01-29 09:24:21,615 INFO org.apache.phoenix.util.UpgradeUtil:
 > Upgrading metadata to add parent to child links for views
 > 2020-01-29 09:24:21,628 INFO
 > org.apache.hadoop.hbase.client.ConnectionImplementation: Closing
master
 > protocol: MasterService
 > 2020-01-29 09:24:21,631 INFO
 > org.apache.phoenix.log.QueryLoggerDisruptor: Shutting down
 > QueryLoggerDisruptor..
 >
 > On client side:
 > java.lang.RuntimeException:
 > org.apache.phoenix.schema.TableNotFoundException: ERROR 1012
(42M03):
 > Table undefined. tableName=SYSTEM.CHILD_LINK
 >
 > Can you point me to upgrade guide for Phoenix ? I tried to
find it by
 > myself and have no luck.
 >
 > On Thu, Jan 16, 2020 at 1:08 PM Prathap Rajendran <prathap...@gmail.com> wrote:
 >
 >     Hi All,
 >
 >     Thanks for the quick update. Still we have some clarification
about
 >     the context.
 >
 >     Actually we are upgrading from the below version
 >     Source      : apache-phoenix-4.14.0-cdh5.14.2
 >     Destination: apache-phoenix-5.0.0-HBase-2.0-bin.tar.gz
 >     <http://csfci.ih.lucent.com/~prathapr/phoenix62/apache-phoenix-5.0.0-HBase-2.0-bin.tar.gz>

 >
 >     Just FYI, we have already upgraded to Hbase  2.0.
 >
 >     Still we are facing the issue below, Once we create this table
 >     manually, then there is no issues to run DML operations.
 >        >     org.apache.hadoop.hbase.TableNotFoundException:
 >     SYSTEM.CHILD_LINK
 >
 >     Please let me know if any steps/documents for phoenix upgrade
from
 >     4.14 to 5.0.
 >
 >     Thanks,
 >     Prathap
 >
 >
 >     On Tue, Jan 14, 2020 at 11:34 PM Josh Elser <els...@apache.org> wrote:
 >
 >         (with VP-Phoenix hat on)
 >
 >         This is not an official Apache Phoenix release, nor does it
 >         follow the
 >         ASF trademarks/branding rules. I'll be following up with the
 >         author to
 >         address the trademark violations.
 >
 >         Please direct your questions to the author of this project.
 >         Again, it is
 >         *not* Apache Phoenix.
   

Re: Query on phoenix upgrade to 5.1.0

2020-01-29 Thread Josh Elser

Aleksandr and Prathap,

Upgrades are done in Phoenix as they always have been. You should deploy 
the new version of phoenix-server jars to HBase, and then the first time 
a client connects with the Phoenix JDBC driver, that client will trigger 
an update to any system tables schema.


As such, you need to make sure that this client has permission to alter 
the phoenix system tables that exist, often requiring admin-level access 
to hbase. Your first step should be collecting DEBUG log from your 
Phoenix JDBC client on upgrade.


Please also remember that 5.0.0 is pretty old at this point -- we're 
overdue for a 5.1.0. There may be existing issues that have already been 
fixed around the upgrade. Doing a search on Jira if you've not done so 
already is important.


On 1/29/20 4:30 AM, Aleksandr Saraseka wrote:

Hello.
I'm second on this.
We upgraded phoenix from 4.14.0 to 5.0.0 (with all underlying things 
like hdfs, hbase) and have the same problem.


We are using queryserver + thin-client
So on PQS side we have:
2020-01-29 09:24:21,579 INFO org.apache.phoenix.util.UpgradeUtil: 
Upgrading metadata to add parent links for indexes on views
2020-01-29 09:24:21,615 INFO org.apache.phoenix.util.UpgradeUtil: 
Upgrading metadata to add parent to child links for views
2020-01-29 09:24:21,628 INFO 
org.apache.hadoop.hbase.client.ConnectionImplementation: Closing master 
protocol: MasterService
2020-01-29 09:24:21,631 INFO 
org.apache.phoenix.log.QueryLoggerDisruptor: Shutting down 
QueryLoggerDisruptor..


On client side:
java.lang.RuntimeException: 
org.apache.phoenix.schema.TableNotFoundException: ERROR 1012 (42M03): 
Table undefined. tableName=SYSTEM.CHILD_LINK


Can you point me to an upgrade guide for Phoenix? I tried to find it by 
myself and had no luck.


On Thu, Jan 16, 2020 at 1:08 PM Prathap Rajendran <prathap...@gmail.com> wrote:


Hi All,

Thanks for the quick update. Still we have some clarification about
the context.

Actually we are upgrading from the below version
Source      : apache-phoenix-4.14.0-cdh5.14.2
Destination: apache-phoenix-5.0.0-HBase-2.0-bin.tar.gz

<http://csfci.ih.lucent.com/~prathapr/phoenix62/apache-phoenix-5.0.0-HBase-2.0-bin.tar.gz>

Just FYI, we have already upgraded to Hbase  2.0.

Still we are facing the issue below, Once we create this table
manually, then there is no issues to run DML operations.
   >     org.apache.hadoop.hbase.TableNotFoundException:
SYSTEM.CHILD_LINK

Please let me know if any steps/documents for phoenix upgrade from
4.14 to 5.0.

Thanks,
Prathap


On Tue, Jan 14, 2020 at 11:34 PM Josh Elser <els...@apache.org> wrote:

(with VP-Phoenix hat on)

This is not an official Apache Phoenix release, nor does it
follow the
ASF trademarks/branding rules. I'll be following up with the
author to
address the trademark violations.

Please direct your questions to the author of this project.
Again, it is
*not* Apache Phoenix.

On 1/14/20 12:37 PM, Geoffrey Jacoby wrote:
 > Phoenix 5.1 doesn't actually exist yet, at least not at the
Apache
 > level. We haven't released it yet. It's possible that a
vendor or user
 > has cut an unofficial release off one of our
development branches, but
 > that's not something we can give support on. You should
contact your
 > vendor.
 >
 > Also, since I see you're upgrading from Phoenix 4.14 to 5.1:
The 4.x
 > branch of Phoenix is for HBase 1.x systems, and the 5.x
branch is for
 > HBase 2.x systems. If you're upgrading from a 4.x to a 5.x,
make sure
 > that you also upgrade your HBase. If you're still on HBase
1.x, we
 > recently released Phoenix 4.15, which does have a supported
upgrade path
 > from 4.14 (and a very similar set of features to what 5.1 will
 > eventually get).
 >
 > Geoffrey
 >
 > On Tue, Jan 14, 2020 at 5:23 AM Prathap Rajendran <prathap...@gmail.com> wrote:
 >
 >     Hello All,
 >
 >     We are trying to upgrade the phoenix version from
 >     "apache-phoenix-4.14.0-cdh5.14.2" to
"APACHE_PHOENIX-5.1.0-cdh6.1.0."
 >
 >     I couldn't find out any upgrade steps for the same.
Please help me
 >     out to get any documents available.
 >     *_Note:_*
 >     I have downloaded the below phoenix parcel and trying to
access some
 >     DML operation. I am getting the following error
 >
 

Re: Index table empty

2020-01-27 Thread Josh Elser

Hi Tim,

It sounds like you're doing the right steps to build an index with the 
async approach. Not having records after IndexTool runs successfully is 
definitely unexpected :)


If you aren't getting any records in the index table after running the 
IndexTool, my guess is that something is going awry there. Have you 
looked at the logging produced by that MapReduce job? (e.g. `yarn logs 
-applicationId `).


It's curious that the job runs successfully and sets the index state to 
active, but you don't have any data loaded.


As a sanity check, does `select * from meta_reads limit 10` return you 
expected data?
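
A follow-on sanity check: an EXPLAIN on a query you'd expect to use the 
index should name IDX_GC in the plan (the exact plan text varies by 
version, and 'some-gene' below is just a placeholder value):

EXPLAIN SELECT GENE, CLUSTER FROM META_READS WHERE GENE = 'some-gene';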


For the future, always a good idea to let us know what version of 
Hadoop/HBase/Phoenix you're running when asking questions.


PS: If you're worried about the SocketTimeException, it's probably more 
to do with the size of your data table and the way a `select count(*)` 
runs. This is a full-table scan, and you'd have to increase 
hbase.rpc.timeout at a minimum to a larger value. If this is a normal 
query pattern you intend to service, it will be an exercise in tweaking 
configs.


On 1/27/20 6:01 PM, Tim Dolbeare wrote:

Hello All,

I've run into a problem with a Phoenix index that no amount of googling 
is solving.  I hope someone might have run into this before and can 
offer some suggestions.  I'm a noob BTW, so please don't hesitate to 
point out the most obvious potential issues.  The problem is that after 
indexing a table already populated with 1M rows a) any query that uses 
the new index returns 0 results and b) the index table itself is empty.


I have created a table via psql.py, populated it with 1M rows via 
CsvBulkLoadTool, created an async covered index on that table in 
sqlline.py, followed by a mapreduce index population with IndexTool.  
All of that completes without error, and the index is marked "ACTIVE".


Here are my table and index definitions:

DROP TABLE IF EXISTS meta_reads;
CREATE IMMUTABLE TABLE IF NOT EXISTS meta_reads (
       cluster VARCHAR,
       subclass VARCHAR,
       class VARCHAR,
       sex VARCHAR,
       region VARCHAR,
       subregion VARCHAR,
       cell VARCHAR NOT NULL,
       gene VARCHAR NOT NULL,
       read FLOAT,
       CONSTRAINT my_pk PRIMARY KEY (cell, gene))
IMMUTABLE_STORAGE_SCHEME = ONE_CELL_PER_COLUMN;

create index idx_gc on meta_reads(gene, cluster) include(read) ASYNC;


Almost any query that attempts to use the index returns 0 results, 
however 'select count(*) from meta_reads' throws a SocketTimeoutException.



Any ideas?

Thanks

Tim







Re: Query on phoenix upgrade to 5.1.0

2020-01-14 Thread Josh Elser

(with VP-Phoenix hat on)

This is not an official Apache Phoenix release, nor does it follow the 
ASF trademarks/branding rules. I'll be following up with the author to 
address the trademark violations.


Please direct your questions to the author of this project. Again, it is 
*not* Apache Phoenix.


On 1/14/20 12:37 PM, Geoffrey Jacoby wrote:
Phoenix 5.1 doesn't actually exist yet, at least not at the Apache 
level. We haven't released it yet. It's possible that a vendor or user 
has cut an unofficial release off one of our development branches, but 
that's not something we can give support on. You should contact your 
vendor.


Also, since I see you're upgrading from Phoenix 4.14 to 5.1: The 4.x 
branch of Phoenix is for HBase 1.x systems, and the 5.x branch is for 
HBase 2.x systems. If you're upgrading from a 4.x to a 5.x, make sure 
that you also upgrade your HBase. If you're still on HBase 1.x, we 
recently released Phoenix 4.15, which does have a supported upgrade path 
from 4.14 (and a very similar set of features to what 5.1 will 
eventually get).


Geoffrey

On Tue, Jan 14, 2020 at 5:23 AM Prathap Rajendran wrote:


Hello All,

We are trying to upgrade the phoenix version from
"apache-phoenix-4.14.0-cdh5.14.2" to "APACHE_PHOENIX-5.1.0-cdh6.1.0."

I couldn't find out any upgrade steps for the same. Please help me
out to get any documents available.
*_Note:_*
I have downloaded the below phoenix parcel and trying to access some
DML operation. I am getting the following error


https://github.com/dmilan77/cloudera-phoenix/releases/download/5.1.0-HBase-2.0-cdh6.1.0/APACHE_PHOENIX-5.1.0-cdh6.1.0.p1.0-el7.parcel



*_Error:_*
20/01/13 04:22:41 WARN client.HTable: Error calling coprocessor
service
org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService
for row \x00\x00WEB_STAT
java.util.concurrent.ExecutionException:
org.apache.hadoop.hbase.TableNotFoundException:
org.apache.hadoop.hbase.TableNotFoundException: SYSTEM.CHILD_LINK
         at

org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegionInMeta(ConnectionImplementation.java:860)
         at

org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:755)
         at

org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:137)
         at

org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:326)
         at

org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:153)
         at

org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:58)
         at

org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:192)
         at
org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:267)
         at

org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:435)
         at

org.apache.hadoop.hbase.client.ClientScanner.nextWithSyncCache(ClientScanner.java:310)
         at
org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:595)
         at

org.apache.phoenix.coprocessor.ViewFinder.findRelatedViews(ViewFinder.java:94)
         at

org.apache.phoenix.coprocessor.MetaDataEndpointImpl.dropChildViews(MetaDataEndpointImpl.java:2488)
         at

org.apache.phoenix.coprocessor.MetaDataEndpointImpl.createTable(MetaDataEndpointImpl.java:2083)
         at

org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:17053)
         at
org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8218)
         at

org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2423)
         at

org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2405)
         at

org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:42010)

Thanks,
Prathap



Re: [ANNOUNCE] Apache Phoenix 4.15.0 released

2020-01-11 Thread Josh Elser
Responding with my VP Phoenix hat on, as this worries me with its blurred lines 
between what is "Vendor" and what is "Apache".


First, Cloudera/Hortonworks and any other company who redistributes some 
version of Apache Phoenix (henceforth, called a "vendor") do not do so 
as official Apache Phoenix releases. Vendors do not have the ability to 
create Apache Phoenix releases. They can only create releases based on 
some Apache Phoenix release. Only the Phoenix PMC holds the power to 
create Apache Phoenix releases.


The "Phoenix for CDH" releases created in the past were done as Apache 
Phoenix community efforts. Specifically, committers and PMC members took 
the time to personally ensure that releases of Apache Phoenix work 
against specific CDH versions. These are volunteer efforts (as all 
Apache efforts are), not corporate efforts.


The future roadmap of what the Phoenix PMC releases against CDH (or any other 
Apache Hadoop/HBase "compatible" release) is a wholly separate process from 
what any vendor creates as a derivative of Apache Phoenix.


(Aside: sometimes, vendors employ people who work on Apache projects, 
but they are held to a high standard to act within specific guidelines 
to prevent these lines from being blurred.)


Finally, again, the Phoenix releases built against CDH by the community are 
volunteer-driven. We would be very happy to have more people donate their 
time and effort to creating more Apache Phoenix releases which we can 
provide to our users. 
Contacting the d...@phoenix.apache.org list is the best way to get started.


Sorry this was so heavy-handed -- this is a very subtle but important 
distinction that Apache projects need to always be very clear about.


- Josh

On 1/10/20 3:11 PM, chinogitano wrote:

Hi Chinmay:

Good to hear of the progress.
Will the project News, Recent Improvements, and Roadmap pages be updated
soon?

Also, will CDH versions be released for 4.15?  As you know, the
Cloudera/Hortonworks landscape are going through massive overhauls right
now, with corporate announcing renewed support for Phoenix.  However, very
little detail about licensing terms, roadmap, or relationship with the
Apache parent project is released.  Meanwhile, the last CDH release was 1.5
years ago.  Would love some clarity on the future of this project from the
core commiters.

Many thanks,
Miles Yao





--
Sent from: http://apache-phoenix-user-list.1124778.n5.nabble.com/



Re: FAQ page is blank now

2020-01-09 Thread Josh Elser

Indeed it is. Looks like a dev had a bad commit to the website. Fixing.

Thanks for letting us know.

On 1/9/20 10:42 AM, Alexander Batyrshin wrote:
  Looks like FAQ page at http://phoenix.apache.org/faq.html is blank now


Re: Apache Phoenix website search seems to be broken

2019-12-13 Thread Josh Elser
I'm not sure who actually runs search-hadoop. I don't believe it's 
anyone affiliated with Apache Phoenix. Thanks for letting us know it's 
broken, Anil!


I'll drop it from the website for now. Maybe we can find something akin 
to a `site:phoenix.apache.org ` Google search that we can embed?


On 12/13/19 4:42 PM, anil gupta wrote:

Hi,

When I try to use the search feature on https://phoenix.apache.org/ it takes me to 
http://www1.search-hadoop.com/?subid4=1576273131.0028120806 and there 
are no results. Is this a temporary error, or is the search-hadoop website gone?


--
Thanks & Regards,
Anil Gupta


Re: Issue using ANY ARRAY feature

2019-12-09 Thread Josh Elser

Hi Simon,

Thanks for replying back with your fix. We appreciate it when folks do this 
so that others can also see the solution.


http://phoenix.apache.org/download.html only publishes the "latest" 
release for a line that we're maintaining. That's why you'll see 4.14.3 
listed on the website, not 4.14.0/1/2.


If you see a release x.y.z, you can reasonably assume[1] that you'll 
also find release x.y.z' where z'=[0,z). We expect that compatibility is 
maintained in the bugfix releases on a given line, so there is no reason 
not to update to the latest version.


- Josh

[1] The one caveat here is that if there is a security-issues, we may 
explicitly pull a release from being downloaded.


On 12/8/19 6:12 PM, Simon Mottram wrote:

Update:

Just in case anyone hits this issue in future with the AWS managed
HBase, the fix is to use a very specific version of the driver

For thick client:

 
<dependency>
    <groupId>org.apache.phoenix</groupId>
    <artifactId>phoenix-core</artifactId>
    <version>4.14.1-HBase-1.4</version>
</dependency>

For thin client:

 
<dependency>
    <groupId>org.apache.phoenix</groupId>
    <artifactId>phoenix-queryserver-client</artifactId>
    <version>4.14.1-HBase-1.4</version>
</dependency>

This required support help from AWS as this driver version is not
mentioned on the official Apache Phoenix download page

Regards

Simon


On Sun, 2019-11-17 at 21:03 +, Simon Mottram wrote:

Phoenix Version: 4.14.2-HBase-1.4
HBase Version: AWS EMR
Release
label:emr-5.24.1
Hadoop distribution:Amazon
Applications:Phoenix 4.14.1,
Hue 4.4.0, HBase 1.4.9


Having an issue where ANY(ARRAY[]) stops the query returning any
results when used in an 'AND' conjunction

e.g. (fielda = 'a') AND (fieldb = ANY(ARRAY['a','b','c']))

Always returns zero results, if i change to disjunction (OR) it works
fine but obviously isn't what's wanted here.  Excuse the long post
but
I wanted to be as clear as possible.

It's quite possible I have misunderstood the way ANY() works but...

Here's a simple query that returns a correct number of results:

SELECT BIOMATERIALNAME, OBSERVATIONDATE, TRIALNAME
  FROM DEV_OAPI.OBSERVATION
WHERE (BIOMATERIALNAME = 'ROOT00386') AND (TRIALNAME =
'TRIAL00015')
  ORDER BY OBSERVATIONDATE DESC
  LIMIT 10
  OFFSET 0

So there's definitely records where biomaterial name and trialname
have
these values

If I change it to

SELECT BIOMATERIALNAME, OBSERVATIONDATE, TRIALNAME
  FROM DEV_OAPI.OBSERVATION
WHERE (TRIALNAME = ANY(ARRAY['TRIAL00015', 'SOMETHING ELSE']))
  ORDER BY OBSERVATIONDATE DESC
  LIMIT 10
  OFFSET 0

I get valid results

So the ANY(ARRAY[]) function works

Here's the explain which looks very odd to more, but it works

PLAN
CLIENT 1-CHUNK 50574 ROWS 314572800 BYTES PARALLEL 1-WAY FULL SCAN OVER DEV_OAPI.OBSERVATION
    SERVER FILTER BY org.apache.phoenix.expression.function.ArrayAnyComparisonExpression [children=[ARRAY['TRIAL00015'], TRIALNAME = org.apache.phoenix.expression.function.ArrayElemRefExpression [children=[ARRAY['TRIAL00015'], 1
    SERVER TOP 10 ROWS SORTED BY [OBSERVATIONDATE DESC]
CLIENT MERGE SORT
CLIENT LIMIT 10


So far so good, BUT.

However if I combine the ARRAY expression with any expression using
AND
I get zero results, even tho as above both sides of the conjunction
return true.

e.g.

SELECT BIOMATERIALNAME, OBSERVATIONDATE, TRIALNAME
  FROM DEV_OAPI.OBSERVATION
WHERE (BIOMATERIALNAME = 'ROOT00386') AND (TRIALNAME =
ANY(ARRAY['TRIAL00015', 'SOMETHING ELSE']))
  ORDER BY OBSERVATIONDATE DESC
  LIMIT 10
  OFFSET 0

Explain (newlines added):
  SERVER FILTER BY
  (BIOMATERIALNAME = 'SCION00424'
  AND
  org.apache.phoenix.expression.function.ArrayAnyComparisonExpression
[children=[ARRAY['TRIAL00015','SOMETHING ELSE'],
  TRIALNAME =
org.apache.phoenix.expression.function.ArrayElemRefExpression
[children=[ARRAY['TRIAL00015','SOMETHING ELSE'], 1)

Just out of interest I tried with strings only in the array check

SELECT BIOMATERIALNAME, OBSERVATIONDATE, TRIALNAME
  FROM DEV_OAPI.OBSERVATION
WHERE (BIOMATERIALNAME = 'ROOT00386') AND ('TRIAL00015' =
ANY(ARRAY['TRIAL00015']))
  ORDER BY OBSERVATIONDATE DESC
  LIMIT 10
  OFFSET 0

This works fine (in a kind of unhelpful way)

I have tested using the thick client:

<dependency>
    <groupId>org.apache.phoenix</groupId>
    <artifactId>phoenix-core</artifactId>
    <version>4.14.2-HBase-1.4</version>
</dependency>

and the thin client

  
<dependency>
    <groupId>org.apache.phoenix</groupId>
    <artifactId>phoenix-queryserver-client</artifactId>
    <version>4.14.2-HBase-1.4</version>
</dependency>

I've tried using braces and re-ordering but any query of the form:
 AND field = ANY(ARRAY['value1'...])
Returns zero results regardless of values

We can't change version as we are using the Amazon AWS EMR managed
stack and no other phoenix libraries work.

Thanks for taking the time to read this far!

Cheers

Simon


New committer: Istvan Toth

2019-12-03 Thread Josh Elser

Everyone,

On behalf of the PMC, I'm pleased to announce Apache Phoenix's newest 
committer: Istvan Toth.


Please join me in extending a warm welcome to Istvan -- congratulations 
on this recognition. Thank you for your contributions and we all look 
forward to more involvement in the future!


- Josh


Re: Phoenix spark java integration example

2019-11-26 Thread Josh Elser

That would be great!

On 11/25/19 6:43 PM, Karthigeyan r wrote:
Hi Josh, thanks for replying to me. At this moment I am developing 
an integration using Java, but as you know it is tough without 
any additional samples. Once completed, I can upload the source base for 
others to gain some insights.


On Mon, 25 Nov 2019 at 8:47 AM, Josh Elser <mailto:josh.el...@gmail.com>> wrote:


Hi Karthigeyan,

I'm moving your message over to the users list in the hopes that a more
broad audience might see your ask. The dev list is focused around the
day-to-day development of Apache Phoenix.

I don't believe I've seen any examples of the phoenix-spark module in
Java. If you happen to find some, it would be good to have at least one
example available in our codebase.

On 11/22/19 6:13 PM, Karthigeyan r wrote:
 > Dear Team,
 >
 >
 >
 > I am just wondering phoenix as developed on java, I tried to
develop an
 > application with Phoenix-spark plugin. While searching for
examples, I am
 > not seeing extensive example in java rather I find most of it in
scala or
 > python. Do we have detailed example set of various Phoenix-spark
 > integration in java ?
 >
 >
 >
 >
 >
 > Karthigeyan
 >



Re: Phoenix spark java integration example

2019-11-25 Thread Josh Elser

Hi Karthigeyan,

I'm moving your message over to the users list in the hopes that a more 
broad audience might see your ask. The dev list is focused around the 
day-to-day development of Apache Phoenix.


I don't believe I've seen any examples of the phoenix-spark module in 
Java. If you happen to find some, it would be good to have at least one 
example available in our codebase.
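
For anyone who finds this thread later, here is a minimal, hedged sketch of what reading a Phoenix table from Java via the phoenix-spark data source can look like on the 4.x line. The table name, ZooKeeper quorum, and filter below are placeholders; the data source name "org.apache.phoenix.spark" and the "table"/"zkUrl" options are the ones documented on the phoenix_spark page.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class PhoenixSparkJavaExample {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("phoenix-spark-java-example")
                    .getOrCreate();

            // Load a Phoenix table as a DataFrame. "MY_TABLE" and the zkUrl are placeholders.
            Dataset<Row> df = spark.read()
                    .format("org.apache.phoenix.spark")
                    .option("table", "MY_TABLE")
                    .option("zkUrl", "zk-host:2181")
                    .load();

            // Filters and column pruning are pushed down to Phoenix where possible.
            df.filter("ID > 100").select("ID").show();

            spark.stop();
        }
    }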


On 11/22/19 6:13 PM, Karthigeyan r wrote:

Dear Team,



I am just wondering: Phoenix is developed in Java, and I tried to develop an
application with the Phoenix-Spark plugin. While searching for examples, I am
not seeing extensive examples in Java; rather, I find most of them in Scala or
Python. Do we have a detailed set of examples of various Phoenix-Spark
integrations in Java?





Karthigeyan



Re: When Phoenix 5.x for HBase-2.x will be updated?

2019-11-04 Thread Josh Elser
ASF policy states that we have to talk about unreleased code on the dev 
list, not the user list. Moving this over to the dev list as such :)


On 11/4/19 8:41 AM, Alexander Batyrshin wrote:

As I see there are many bug fixes and updates (consistent indexes) for 4.x 
Phoenix branch.
So im curios when this will be available for 5.x branch?



Re: Need help with PHONEIX 5

2019-10-27 Thread Josh Elser
Venkat,

First, it is poor etiquette to specifically send project members emails.
Please refrain from doing this in the future for any project at the ASF
unless you have explicit approval from that person.

Second, there is no official release of Phoenix that advertises support
against any HBase 2.1.x release. As such, I can only imagine that you are
building from the master branch and deploying this on your own. If this is
the case, please move your discussion to the developers list, as ASF
guidelines expressly prohibit discussions on non-releases software anywhere
except the dev list. Please confirm what exactly you are doing before we
proceed further.

On Fri, Oct 25, 2019 at 10:16 PM Venkat Chunduru 
wrote:

> Hello  ,
> I am working on moving our existing  Phoenix to latest version of Phoenix
> 5 with Hbase 2.1 from current version of Phoenix 4.13 hbase 1.2  and
> getting compatibility issues and seeking some help on how to migrate to
> current version of phoenix as i am under time crunch hoping to get help ,
> thank you for your help in advance.
> Here is the error i am getting :
> Error: ERROR 504 (42703): Undefined column.
> columnName=SYSTEM.CATALOG.TRANSACTION_PROVIDER (state=42703,code=504)
> org.apache.phoenix.schema.ColumnNotFoundException: ERROR 504 (42703):
> Undefined column. columnName=SYSTEM.CATALOG.TRANSACTION_PROVIDER at
> org.apache.phoenix.schema.PTableImpl.getColumnForColumnName(PTableImpl.java:828)
> at
> org.apache.phoenix.compile.FromCompiler$SingleTableColumnResolver.resolveColumn(FromCompiler.java:482)
> at
> org.apache.phoenix.compile.UpsertCompiler.compile(UpsertCompiler.java:452)
> at
> org.apache.phoenix.jdbc.PhoenixStatement$ExecutableUpsertStatement.compilePlan(PhoenixStatement.java:784)
> at
> org.apache.phoenix.jdbc.PhoenixStatement$ExecutableUpsertStatement.compilePlan(PhoenixStatement.java:770)
> at
> org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:401)
> at
> org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:391)
>
> Regards,
> Venkat Chunduru
>
> *Big Data Lead Engineer *
>
>
> W:  collectivei.com
>
> T:   @collectivei
>
>
> New York City · Silicon Valley · Montreal
>
>
>
> Collective[i] (short for Collective Intelligence) dramatically improves
> sales and marketing performance using technology, applications and a
> revolutionary network designed to help our clients better understand their
> buyers and grow revenue. Our goal is to maximize human potential and
> minimize mistakes. In most cases, the results are astounding. We cannot,
> however, stop emails from sometimes being sent to the wrong person. If you
> are not the intended recipient, please notify us by replying to this
> email's sender and deleting it (and any attachments) permanently from your
> system. If you are meant to receive this email, please respect the
> confidentiality of this communication's contents.
>
> The views and opinions included in this email belong to the author and do
> not necessarily reflect the views and opinions of Collective[i]. Our
> employees are obliged not to make any defamatory statements or otherwise
> infringe on another party’s legal rights. In the event of any damages or
> other liabilities arising from such statements, our employees shall be
> fully and solely personally responsible for the content of their emails.
>
> Finally, transparency is a core value at Collective[i]. In order to
> provide you with better service and/or to improve our internal operations,
> we may share your information with third parties such as Google, LinkedIn
> and others. We make certain client information available to third parties
> (a) in order to comply with various reporting obligations; (b) for
> Collective[i]’s business or marketing purposes; and (c) to better
> understand our current and future clients. For more information, please see
> our Privacy Policy .
>


Re: Index on SYSTEM.LOG failed

2019-10-25 Thread Josh Elser
My question was not meant to imply that creating any index should fail 
in the same manner as what you see here.


https://docs.oracle.com/javase/8/docs/api/java/lang/IllegalAccessError.html

An IllegalAccessError means that the PQS tried to access this class but 
it failed (for some reason). Of course, the thin client is not going to 
be trying to access code that it can't.


It could be that there is an issue in creating a local index against the 
system.log table, but this real error is being lost because of your 
classpath issue. Can't really say anything definitively with the 
information you've provided.


On 10/25/19 8:21 AM, Aleksandr Saraseka wrote:

Hello.
Indexes on other tables created without any problems.
0: jdbc:phoenix:thin:url=http://localhost:876> create local index 
alex_test_data_idx on alex.test (data);

No rows affected (10.93 seconds)
0: jdbc:phoenix:thin:url=http://localhost:876>

On Thu, Oct 24, 2019 at 8:18 PM Josh Elser <mailto:els...@apache.org>> wrote:


Do you have a mismatch of Phoenix thinclient jars and Phoenix
QueryServer versions?

You're getting a classpath-type error, not some Phoenix internal error.

On 10/24/19 10:01 AM, Aleksandr Saraseka wrote:
 > Hello. We're logging queries in Phoenix.
 > Main criteria can be a start_time (to investigate possible
performance
 > problems in some particular time).
 > Execution plan for the query shows full scan - that could cause
problems
 > with a lot of data^
 > explain select query from system.LOG order by start_time;
 >

 > +------------------------------------------------------------+-----------------+----------------+--------------+
 > |                            PLAN                            | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
 > +------------------------------------------------------------+-----------------+----------------+--------------+
 > | CLIENT 32-CHUNK PARALLEL 32-WAY FULL SCAN OVER SYSTEM:LOG  | null            | null           | null         |
 > | CLIENT MERGE SORT                                          | null            | null           | null         |
 > +------------------------------------------------------------+-----------------+----------------+--------------+
 > 2 rows selected
 >
 > So I'm trying to create local index on start_time field, but
getting an
 > exception. Is this "by design" that you can not create index on
SYSTEM
 > tables or I need to do this in some another way ?
 > CREATE LOCAL INDEX "system_log_start_time_idx" ON SYSTEM.LOG
("START_TIME");
 > Error: Error -1 (0) : Error while executing SQL "CREATE LOCAL
INDEX
 > "system_log_start_time_idx" ON SYSTEM.LOG ("START_TIME")": Remote
driver
 > error: IndexOutOfBoundsException: Index: 0 (state=0,code=-1)
 > org.apache.calcite.avatica.AvaticaSqlException: Error -1 (0)
: Error
 > while executing SQL "CREATE LOCAL INDEX
"system_log_start_time_idx" ON
 > SYSTEM.LOG ("START_TIME")": Remote driver error:
 > IndexOutOfBoundsException: Index: 0
         at org.apache.phoenix.shaded.org.apache.calcite.avatica.Helper.createException(Helper.java:54)
         at org.apache.phoenix.shaded.org.apache.calcite.avatica.Helper.createException(Helper.java:41)
         at org.apache.phoenix.shaded.org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:163)
         at org.apache.phoenix.shaded.org.apache.calcite.avatica.AvaticaStatement.execute(AvaticaStatement.java:217)
 >          at sqlline.Commands.execute(Commands.java:822)
 >          at sqlline.Commands.sql(Commands.java:732)
 >          at sqlline.SqlLine.dispatch(SqlLine.java:813)
 >          at sqlline.SqlLine.begin(SqlLine.java:686)
 >          at sqlline.SqlLine.start(SqlLine.java:398)
 >          at sqlline.SqlLine.main(SqlLine.java:291)
         at org.apache.phoenix.queryserver.client.SqllineWrapper.main(SqllineWrapper.java:93)
 > java.lang.IllegalAccessError: org/apache/phoenix/shaded/org/apache/calcite/avatica/AvaticaSqlException$PrintStreamOrWriter
 >          at org.apache.calcite.avatica.AvaticaSqlException.printStackTrace(AvaticaSqlException.java:75)
 >          at java.lang.Throwable.printStackTrace(Throwable.java:634)
 >          at sqlline.SqlLine.handleSQ

Re: Index on SYSTEM.LOG failed

2019-10-24 Thread Josh Elser
Do you have a mismatch of Phoenix thinclient jars and Phoenix 
QueryServer versions?


You're getting a classpath-type error, not some Phoenix internal error.

On 10/24/19 10:01 AM, Aleksandr Saraseka wrote:

Hello. We're logging queries in Phoenix.
Main criteria can be a start_time (to investigate possible performance 
problems in some particular time).
Execution plan for the query shows full scan - that could cause problems 
with a lot of data^

explain select query from system.LOG order by start_time;
+------------------------------------------------------------+-----------------+----------------+--------------+
|                            PLAN                            | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
+------------------------------------------------------------+-----------------+----------------+--------------+
| CLIENT 32-CHUNK PARALLEL 32-WAY FULL SCAN OVER SYSTEM:LOG  | null            | null           | null         |
| CLIENT MERGE SORT                                          | null            | null           | null         |
+------------------------------------------------------------+-----------------+----------------+--------------+
2 rows selected

So I'm trying to create a local index on the start_time field, but I am getting an 
exception. Is it "by design" that you cannot create an index on SYSTEM 
tables, or do I need to do this in some other way?

CREATE LOCAL INDEX "system_log_start_time_idx" ON SYSTEM.LOG ("START_TIME");
Error: Error -1 (0) : Error while executing SQL "CREATE LOCAL INDEX 
"system_log_start_time_idx" ON SYSTEM.LOG ("START_TIME")": Remote driver 
error: IndexOutOfBoundsException: Index: 0 (state=0,code=-1)
org.apache.calcite.avatica.AvaticaSqlException: Error -1 (0) : Error 
while executing SQL "CREATE LOCAL INDEX "system_log_start_time_idx" ON 
SYSTEM.LOG ("START_TIME")": Remote driver error: 
IndexOutOfBoundsException: Index: 0
         at 
org.apache.phoenix.shaded.org.apache.calcite.avatica.Helper.createException(Helper.java:54)
         at 
org.apache.phoenix.shaded.org.apache.calcite.avatica.Helper.createException(Helper.java:41)
         at 
org.apache.phoenix.shaded.org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:163)
         at 
org.apache.phoenix.shaded.org.apache.calcite.avatica.AvaticaStatement.execute(AvaticaStatement.java:217)

         at sqlline.Commands.execute(Commands.java:822)
         at sqlline.Commands.sql(Commands.java:732)
         at sqlline.SqlLine.dispatch(SqlLine.java:813)
         at sqlline.SqlLine.begin(SqlLine.java:686)
         at sqlline.SqlLine.start(SqlLine.java:398)
         at sqlline.SqlLine.main(SqlLine.java:291)
         at 
org.apache.phoenix.queryserver.client.SqllineWrapper.main(SqllineWrapper.java:93)
java.lang.IllegalAccessError: 
org/apache/phoenix/shaded/org/apache/calcite/avatica/AvaticaSqlException$PrintStreamOrWriter
         at 
org.apache.calcite.avatica.AvaticaSqlException.printStackTrace(AvaticaSqlException.java:75)

         at java.lang.Throwable.printStackTrace(Throwable.java:634)
         at sqlline.SqlLine.handleSQLException(SqlLine.java:1540)
         at sqlline.SqlLine.handleException(SqlLine.java:1505)
         at sqlline.SqlLine.error(SqlLine.java:905)
         at sqlline.Commands.execute(Commands.java:860)
         at sqlline.Commands.sql(Commands.java:732)
         at sqlline.SqlLine.dispatch(SqlLine.java:813)
         at sqlline.SqlLine.begin(SqlLine.java:686)
         at sqlline.SqlLine.start(SqlLine.java:398)
         at sqlline.SqlLine.main(SqlLine.java:291)
         at 
org.apache.phoenix.queryserver.client.SqllineWrapper.main(SqllineWrapper.java:93)



--
Aleksandr Saraseka
DBA
380997600401
 • asaras...@eztexting.com • eztexting.com




Re: Sequence number

2019-10-22 Thread Josh Elser
Are you saying that you didn't restart the Phoenix QueryServer after you 
restored the Phoenix system tables?


And then, after running into issues, you restarted PQS and then it 
worked as expected?


I can respect that we probably don't say this anywhere, but you should 
definitely be restarting any Phoenix clients (including PQS) if you are 
wiping the Phoenix system tables.


On 10/22/19 5:08 PM, jesse wrote:

It is properly restored, we double checked.

We worked around the issue by restarting the query server.

But it seems a bad bug.





On Tue, Oct 22, 2019, 11:34 AM Thomas D'Silva > wrote:


Are you sure SYSTEM.SEQUENCE was restored properly? What is the
current value of the sequence in the restored table?

On Fri, Oct 4, 2019 at 1:52 PM jesse mailto:chat2je...@gmail.com>> wrote:

Let's say there is a running cluster A, with table:books and
system.sequence current value 5000, cache size 100, incremental
is 1, the latest book with sequence id:4800

Now the cluster A snapshot is backed up & restored into cluster
b, system.sequence and books table are properly restored, when
we add a new book, the book gets sequence id: 12, why it is not
4801 or 5001?

Our Phoenix version : 4.14.2

Thanks





Re: Materialized views in Hbase/Phoenix

2019-09-30 Thread Josh Elser
Bulk loading would help a little bit with the "all-or-nothing" problem, 
but it still would not be foolproof.


You could have a set of files destined for different tables, each with 
very clear data that needs to be loaded, but if a file failed 
to load, you would have to take some steps to keep retrying.


On 9/27/19 12:22 PM, Gautham Acharya wrote:
We will be reaching 100million rows early next year, and then billions 
shortly after that. So, Hbase will be needed to scale to that degree.


If one of the tables fails to write, we need some kind of a rollback 
mechanism, which is why I was considering a transaction. We cannot be in 
a partial state where some of the ‘views’ are written and some aren’t.


*From:*Pedro Boado [mailto:pedro.bo...@gmail.com]
*Sent:* Friday, September 27, 2019 7:22 AM
*To:* user@phoenix.apache.org
*Subject:* Re: Materialized views in Hbase/Phoenix

*CAUTION:*This email originated from outside the Allen Institute. Please 
do not click links or open attachments unless you've validated the 
sender and know the content is safe.




For just a few million rows I would go for a RDBMS and not Phoenix / HBase.

You don't really need transactions to control completion, just write a 
flag (a COMPLETED empty file, for instance) as a final step in your job.
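
A hedged sketch of that completion-flag idea, using the Hadoop FileSystem API (the output path and marker name are just placeholders):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class CompletionMarker {
        // Call this only after every table write in the job has succeeded;
        // downstream readers treat the presence of the marker as "all views are consistent".
        public static void markComplete(Configuration conf, String outputDir) throws Exception {
            FileSystem fs = FileSystem.get(conf);
            fs.createNewFile(new Path(outputDir, "_COMPLETED"));
        }
    }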


On Fri, 27 Sep 2019, 15:03 Gautham Acharya, > wrote:


Thanks Anil.

So, what you’re essentially advocating for is to use some kind of
Spark/compute framework (I was going to use AWS Glue) job to write
the ‘materialized views’ as separate tables (maybe tied together
with some kind of a naming convention?)

In this case, we’d end up with some sticky data consistency issues
if the write job failed halfway through (some ‘materialized view’
tables would be updated, and some wouldn’t). Can I use Phoenix
transactions to wrap the write jobs together, to make sure either
all the data is updated, or none?

--gautham

*From:*anil gupta [mailto:anilgupt...@gmail.com
]
*Sent:* Friday, September 27, 2019 6:58 AM
*To:* user@phoenix.apache.org 
*Subject:* Re: Materialized views in Hbase/Phoenix

*CAUTION:*This email originated from outside the Allen Institute.
Please do not click links or open attachments unless you've
validated the sender and know the content is safe.



For your use case, i would suggest to create another table that
stores the matrix. Since this data doesnt change that often, maybe
you can write a nightly spark/MR job to update/rebuild the matrix
table.(If you want near real time that is also possible with any
streaming system) Have you looked into bloom filters? It might help
if you have sparse dataset and you are using Phoenix dynamic columns.
We use dynamic columns for a table that has columns upto 40k. Here
is the presentation and optimizations we made for that use case:
https://www.slideshare.net/anilgupta84/phoenix-con2017-truecarfinal
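
To make the dynamic-column idea above concrete, here is a rough sketch (the MATRIX table, GENE_123 column, and connection URL are invented for illustration; the inline type declarations follow the syntax on the Phoenix dynamic columns page):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class DynamicColumnsSketch {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181");
                 Statement stmt = conn.createStatement()) {
                // Write a column that was never declared in CREATE TABLE by declaring it inline.
                stmt.execute("UPSERT INTO MATRIX (ROW_KEY, GENE_123 BIGINT) VALUES ('r1', 42)");
                conn.commit();

                // Read it back; dynamic columns must be re-declared after the table name.
                try (ResultSet rs = stmt.executeQuery(
                        "SELECT ROW_KEY, GENE_123 FROM MATRIX (GENE_123 BIGINT) WHERE ROW_KEY = 'r1'")) {
                    while (rs.next()) {
                        System.out.println(rs.getString(1) + " -> " + rs.getLong(2));
                    }
                }
            }
        }
    }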



IMO, Hive integration with HBase is not fully baked and it has a lot
of rough edges. So, it better to stick with native Phoenix/HBase if
you care about performance and ease of operations.

HTH,

Anil Gupta

On Wed, Sep 25, 2019 at 10:01 AM Gautham Acharya
mailto:gauth...@alleninstitute.org>>
wrote:

Hi,

Currently I'm using Hbase to store large, sparse matrices of
50,000 columns 10+ million rows of integers.

This matrix is used for fast, random access - we need to be able
to fetch random row/column subsets, as well as entire columns.
We also want to very quickly fetch aggregates (Mean, median,
etc) on this matrix.

The data does not change very often for these matrices (a few
times a week at most), so pre-computing is very feasible here.
What I would like to do is maintain a column store (store the
column names as row keys, and a compressed list of all the row
values) for the use case where we select an entire column.
Additionally, I would like to maintain a separate table for each
precomputed aggregate (median table, mean table, etc).

The query time for all these use cases needs to be low latency -
under 100ms.

When the data does change for a certain matrix, it would be nice
to easily update the optimized table. Ideally, I would like the
column 

Re: Performance degradation on query analysis

2019-09-24 Thread Josh Elser
Did you change your configuration to prevent compactions from regularly 
happening, Stepan?


By default, you should have a major compaction run weekly, which would 
have fixed this for you; minor compactions would also have run 
automatically to rewrite small HFiles as you create new 
ones (generating new stats).
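
For reference, the compaction Ankit suggests below can be requested from the HBase shell (major_compact 'SYSTEM.STATS') or, as a hedged sketch, through the HBase Admin API; the physical table name may be SYSTEM:STATS if namespace mapping is enabled.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;

    public class CompactStatsTable {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Admin admin = conn.getAdmin()) {
                // Asynchronously requests a major compaction so deleted/expired
                // guidepost cells are actually purged from the HFiles.
                admin.majorCompact(TableName.valueOf("SYSTEM.STATS"));
            }
        }
    }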


On 9/19/19 4:50 PM, Ankit Singhal wrote:

Please schedule compaction on SYSTEM.STATS table to clear the old entries.

On Thu, Sep 19, 2019 at 1:48 PM Stepan Migunov 
<mailto:stepan.migu...@firstlinesoftware.com>> wrote:


Thanks, Josh. The problem was really related to reading the SYSTEM.STATS
table.
There were only 8,000 rows in the table, but COUNT took more than 10
minutes. I noticed that the storage files (34) had a total size of
10 GB.

DELETE FROM SYSTEM.STATS did not help - the storage files are still
10 GB,
and COUNT took a long time.
Then I truncated the table from the hbase shell. And this fixed the
problem - after UPDATE STATS for each table, everything works fine.

Are there any known issues with SYSTEM.STATS table? Apache Phoenix
4.13.1
with 15 Region Servers.

-Original Message-
    From: Josh Elser [mailto:els...@apache.org <mailto:els...@apache.org>]
Sent: Tuesday, September 17, 2019 5:16 PM
To: user@phoenix.apache.org <mailto:user@phoenix.apache.org>
Subject: Re: Performance degradation on query analysis

Can you share the output you see from the EXPLAIN? Does it differ
between
times it's "fast" and times it's "slow"?

Sharing the table(s) DDL statements would also help, along with the
shape
and version of your cluster (e.g. Apache Phoenix 4.14.2 with 8
RegionServers).

Spit-balling ideas:

Could be reads over the SYSTEM.CATALOG table or the SYSTEM.STATS table.

Have you looked more coarsely at the RegionServer logs/metrics? Any
obvious
saturation issues (e.g. handlers consumed, JVM GC pauses, host CPU
saturation)?

Turn on DEBUG log4j client side (beware of chatty ZK logging) and see if
there's something obvious from when the EXPLAIN is slow.

On 9/17/19 3:58 AM, Stepan Migunov wrote:
 > Hi
 > We have an issue with our production environment - from time to
time we
 > notice a significant performance degradation for some queries.
The strange
 > thing is that the EXPLAIN operator for these queries takes the
same time
 > as queries execution (5 minutes or more). So, I guess, the issue is
 > related to query's analysis but not data extraction. Is it
possible that
 > issue is related to SYSTEM.STATS access problem? Any other ideas?
 >



Re: What is the phoenix-queryserver-client-4.14.2-HBase-1.4.jar?

2019-09-24 Thread Josh Elser
Please be aware that you're referencing code that hasn't been updated in 
3 years. Take it with a grain of salt.


On 9/19/19 6:42 PM, jesse wrote:
Josh, in your sample project pom.xml file, the following build 
dependency is not needed:

<dependency>
    <groupId>org.apache.phoenix</groupId>
    <artifactId>phoenix-server-client</artifactId>
    <version>4.7.0-HBase-1.1-SNAPSHOT</version>
</dependency>


On Thu, Sep 19, 2019, 10:53 AM jesse <mailto:chat2je...@gmail.com>> wrote:


A) phoenix-4.14.2-HBase-1.4-thin-client.jar

Just A) is good enough, Josh, you had a sample program here:
https://github.com/joshelser/phoenix-queryserver-jdbc-client

And the phoenix-4.14.2-HBase-1.4-thin-client.jar already contains
the org.apache.phoenix.queryserver.client.Driver






On Thu, Sep 19, 2019, 8:54 AM jesse mailto:chat2je...@gmail.com>> wrote:

You confused me more, if I write a Java program with http
endpoint to PQS for Phoenix read/write functions, should I depend on

A) phoenix-4.14.2-HBase-1.4-thin-client.jar

B) phoenix-queryserver-client-4.14.2-HBase-1.4.jar

C) both



On Thu, Sep 19, 2019, 4:12 AM Josh Elser mailto:els...@apache.org>> wrote:

"phoenix-queryserver-client" is the name of the Maven module
which holds
the required code for the "JDBC thin client", aka PQS
client, aka
"queryserver client".

Maven convention is that a jar with the name of the Maven
module is created.

However, the majority of the code for the thin client is
pulled from
another Apache project. In fact, we only have one piece of
code that we
maintain client-side to connect to PQS.

That third party code _does_ need to be included on the
classpath for
you to use the client. Thus, a shaded jar is created, with the
human-readable name "thin-client" to make it very clear to
you that this
is the jar the use.

The Maven build shows how all of this work.

On 9/18/19 8:04 PM, jesse wrote:
 > It seems it is just the sqllinewrapper client, so
confusing name...
 >
 >
 >
 > On Wed, Sep 18, 2019, 4:46 PM jesse mailto:chat2je...@gmail.com>
 > <mailto:chat2je...@gmail.com
<mailto:chat2je...@gmail.com>>> wrote:
 >
 >     For query via PQS, we are using
phoenix-4.14.2-HBase-1.4-thin-client.jar
 >
 >     Then what is purpose and usage
 >     of phoenix-queryserver-client-4.14.2-HBase-1.4.jar?
 >
 >     Thanks
 >



Re: What is the phoenix-queryserver-client-4.14.2-HBase-1.4.jar?

2019-09-24 Thread Josh Elser

Yes. Also, this is what I said originally:

> the human-readable name "thin-client" to make it very clear to you 
that this is the jar to use.


We try to be consistent everywhere with the phrase "thin client" to 
indicate what you use to interact with PQS.


On 9/19/19 1:53 PM, jesse wrote:

A) phoenix-4.14.2-HBase-1.4-thin-client.jar

Just A) is good enough, Josh, you had a sample program here:
https://github.com/joshelser/phoenix-queryserver-jdbc-client

And the phoenix-4.14.2-HBase-1.4-thin-client.jar already contains the 
org.apache.phoenix.queryserver.client.Driver







On Thu, Sep 19, 2019, 8:54 AM jesse <mailto:chat2je...@gmail.com>> wrote:


You confused me more, if I write a Java program with http endpoint
to PQS for Phoenix read/write functions, should I depend on

A) phoenix-4.14.2-HBase-1.4-thin-client.jar

B) phoenix-queryserver-client-4.14.2-HBase-1.4.jar

C) both



    On Thu, Sep 19, 2019, 4:12 AM Josh Elser mailto:els...@apache.org>> wrote:

"phoenix-queryserver-client" is the name of the Maven module
which holds
the required code for the "JDBC thin client", aka PQS client, aka
"queryserver client".

Maven convention is that a jar with the name of the Maven module
is created.

However, the majority of the code for the thin client is pulled
from
another Apache project. In fact, we only have one piece of code
that we
maintain client-side to connect to PQS.

That third party code _does_ need to be included on the
classpath for
you to use the client. Thus, a shaded jar is created, with the
human-readable name "thin-client" to make it very clear to you
that this
is the jar the use.

The Maven build shows how all of this work.

On 9/18/19 8:04 PM, jesse wrote:
 > It seems it is just the sqllinewrapper client, so confusing
name...
 >
 >
 >
 > On Wed, Sep 18, 2019, 4:46 PM jesse mailto:chat2je...@gmail.com>
 > <mailto:chat2je...@gmail.com <mailto:chat2je...@gmail.com>>>
wrote:
 >
 >     For query via PQS, we are using
phoenix-4.14.2-HBase-1.4-thin-client.jar
 >
 >     Then what is purpose and usage
 >     of phoenix-queryserver-client-4.14.2-HBase-1.4.jar?
 >
 >     Thanks
 >



Re: What is the phoenix-queryserver-client-4.14.2-HBase-1.4.jar?

2019-09-19 Thread Josh Elser
"phoenix-queryserver-client" is the name of the Maven module which holds 
the required code for the "JDBC thin client", aka PQS client, aka 
"queryserver client".


Maven convention is that a jar with the name of the Maven module is created.

However, the majority of the code for the thin client is pulled from 
another Apache project. In fact, we only have one piece of code that we 
maintain client-side to connect to PQS.


That third party code _does_ need to be included on the classpath for 
you to use the client. Thus, a shaded jar is created, with the 
human-readable name "thin-client" to make it very clear to you that this 
is the jar to use.


The Maven build shows how all of this works.
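
For completeness, a minimal sketch of using that thin-client jar from Java (the PQS host and port are placeholders; the serialization setting must match what PQS is configured for, PROTOBUF by default):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class ThinClientExample {
        public static void main(String[] args) throws Exception {
            // phoenix-<version>-thin-client.jar on the classpath provides the thin JDBC driver.
            String url = "jdbc:phoenix:thin:url=http://pqs-host:8765;serialization=PROTOBUF";
            try (Connection conn = DriverManager.getConnection(url);
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT TABLE_NAME FROM SYSTEM.CATALOG LIMIT 5")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            }
        }
    }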

On 9/18/19 8:04 PM, jesse wrote:

It seems it is just the sqllinewrapper client, so confusing name...



On Wed, Sep 18, 2019, 4:46 PM jesse > wrote:


For query via PQS, we are using phoenix-4.14.2-HBase-1.4-thin-client.jar

Then what is purpose and usage
of phoenix-queryserver-client-4.14.2-HBase-1.4.jar?

Thanks



Re: Performance degradation on query analysis

2019-09-17 Thread Josh Elser
Can you share the output you see from the EXPLAIN? Does it differ 
between times it's "fast" and times it's "slow"?


Sharing the table(s) DDL statements would also help, along with the 
shape and version of your cluster (e.g. Apache Phoenix 4.14.2 with 8 
RegionServers).


Spit-balling ideas:

Could be reads over the SYSTEM.CATALOG table or the SYSTEM.STATS table.

Have you looked more coarsely at the RegionServer logs/metrics? Any 
obvious saturation issues (e.g. handlers consumed, JVM GC pauses, host 
CPU saturation)?


Turn on DEBUG log4j client side (beware of chatty ZK logging) and see if 
there's something obvious from when the EXPLAIN is slow.


On 9/17/19 3:58 AM, Stepan Migunov wrote:

Hi
We have an issue with our production environment - from time to time we notice 
a significant performance degradation for some queries. The strange thing is 
that the EXPLAIN operator for these queries takes the same time as queries 
execution (5 minutes or more). So, I guess, the issue is related to query's 
analysis but not data extraction. Is it possible that issue is related to 
SYSTEM.STATS access problem? Any other ideas?



Re: Need help on steps to copy Phoenix table from HDP 2.6 to HDP 3.1

2019-09-13 Thread Josh Elser

I reached out to Sam in private since he's dealing with non-Apache releases.

His solution was to bulk load files from the old version of Phoenix to 
the new (HDP 2.6 to 3.1, respectively), but when he ran a query in the 
new system, he did not see the records despite them being there in HBase.


The problem was that the column encoding feature was introduced between 
these two versions, so the data was not in the correct column families 
for the newer version of Phoenix. Following his same process, but 
creating the table with COLUMN_ENCODED_BYTES=0, resulted in the data 
being visible from Phoenix queries.
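
For anyone following the same path, a hedged sketch of that re-create step (the table and columns are made up; the important piece is the COLUMN_ENCODED_BYTES=0 option so the new table keeps the pre-column-encoding data layout used by the bulk-loaded HFiles):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class RecreateWithoutColumnEncoding {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181");
                 Statement stmt = conn.createStatement()) {
                // COLUMN_ENCODED_BYTES=0 disables column encoding for this table,
                // so data written by an older (pre-4.10) cluster stays visible to queries.
                stmt.execute("CREATE TABLE MY_TABLE (" +
                        " ID VARCHAR NOT NULL PRIMARY KEY," +
                        " CF.COL1 VARCHAR," +
                        " CF.COL2 BIGINT" +
                        ") COLUMN_ENCODED_BYTES=0");
            }
        }
    }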


On 9/11/19 9:14 AM, Sam Glover wrote:

Hi Phoenix users group,

We need help on steps to copy Phoenix table from HDP 2.6 to HDP 3.1

All replies appreciated.
--

*Sam Glover* | Solutions Architect
t. (512) 550-5363
cloudera.com 












Re: PSQ processlist

2019-09-10 Thread Josh Elser
As you might already know, JDBC is "stateful" in what it does. You have 
a Connection, which creates Statements, and the combination of those two 
track queries being run.


However, HTTP is a stateless protocol. As such, PQS has to cache things 
in memory in order to make this approach work.


To this point, there are Connection and Statement caches which, once 
they are not used, are evicted from the cache and closed[1]. I know that 
Phoenix is not capable of interrupting/free'ing all resources used by a 
Phoenix query (e.g. you cannot interrupt an HBase RPC once it's 
running), but it's likely that Phoenix would clean up the client-side 
state to the best of its ability when the Statement/Connection are closed.


Maybe someone knows the answer to that off the top of their head. 
Otherwise, hopefully this information is a starting point for you to 
look at the code and/or run some experiments.


[1] https://phoenix.apache.org/server.html "Configurations relating to 
the server connection cache."


On 9/10/19 11:28 AM, Aleksandr Saraseka wrote:

Thank you Josh, this is very helpful.
Another question - can we kill long running query in PQS somehow ?

On Mon, Sep 9, 2019 at 5:09 PM Josh Elser <mailto:els...@apache.org>> wrote:


Not unique to PQS, see:

https://issues.apache.org/jira/browse/PHOENIX-2715

On 9/9/19 9:02 AM, Aleksandr Saraseka wrote:
 > Hello.
 > Does Phoenix Query Server have any possibility to track running
queries
 > ? Like user connects with thin client and run some long running
query,
 > can I understand who and what is running ?
 >
 > --
 >               Aleksandr Saraseka
 > DBA
 > 380997600401
 >  • asaras...@eztexting.com • eztexting.com



--
Aleksandr Saraseka
DBA
380997600401
 • asaras...@eztexting.com • eztexting.com




Re: PSQ processlist

2019-09-09 Thread Josh Elser

Not unique to PQS, see:

https://issues.apache.org/jira/browse/PHOENIX-2715

On 9/9/19 9:02 AM, Aleksandr Saraseka wrote:

Hello.
Does Phoenix Query Server have any possibility to track running queries 
? Like user connects with thin client and run some long running query, 
can I understand who and what is running ?


--
Aleksandr Saraseka
DBA
380997600401
 • asaras...@eztexting.com • eztexting.com





Re: Multi-Tenancy and shared records

2019-09-03 Thread Josh Elser

Hi Simon,

Phoenix does not provide any authorization/security layers on top of 
what HBase does (the thread on user@hbase has a suggestion on cell ACLs 
which is good).


I think the question you're ultimately asking is: no, the TenantID is 
not an authorization layer. In a nutshell, the TenantID is just an 
extra attribute (column) added to your primary key constraint 
auto-magically. If a user doesn't set a TenantID, then they see _all_ data.


Unless you have a layer in-between Phoenix and your end-users that add 
extra guarantees/restrictions, a user could set their own TenantID and 
see other folks' data. I don't think this is a good solution for what 
you're trying to accomplish.
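
To make the mechanics concrete, a rough sketch of how the TenantId connection property and a MULTI_TENANT table interact (the names are invented; note again that this only scopes queries, it does not enforce anything if a caller can choose their own TenantId):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.util.Properties;

    public class TenantConnectionSketch {
        public static void main(String[] args) throws Exception {
            String url = "jdbc:phoenix:zk-host:2181"; // placeholder quorum

            // A global connection (no TenantId) owns the DDL and sees all rows.
            try (Connection global = DriverManager.getConnection(url)) {
                global.createStatement().execute(
                        "CREATE TABLE IF NOT EXISTS EVENTS (" +
                        " TENANT_ID VARCHAR NOT NULL," +
                        " EVENT_ID VARCHAR NOT NULL," +
                        " PAYLOAD VARCHAR," +
                        " CONSTRAINT PK PRIMARY KEY (TENANT_ID, EVENT_ID)" +
                        ") MULTI_TENANT=true");
            }

            // A tenant-specific connection: everything on this connection is
            // implicitly scoped to TENANT_ID = 'acme' through a tenant-specific view.
            Properties props = new Properties();
            props.setProperty("TenantId", "acme");
            try (Connection tenant = DriverManager.getConnection(url, props)) {
                tenant.createStatement().execute(
                        "CREATE VIEW IF NOT EXISTS ACME_EVENTS AS SELECT * FROM EVENTS");
                tenant.createStatement().executeQuery("SELECT EVENT_ID FROM ACME_EVENTS");
            }
        }
    }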


On 9/2/19 8:34 PM, Simon Mottram wrote:

Hi

I'm working on a project where we have a combination of very sparse
data columns with added headaches of multi-tenancy.  Hbase looks great
for the back end but I need to check that we can support the customer's
multi-tenancy requirements.

There are 2 that I'm struggling to find a definitive answer for. Any
info most gratefully received

Shared Data
===
Each record in the table must be secured but it could be multiple
tenants for a record.  Think 'shared' data.

So for example if you had 3 records

record1, some secret data
record2, some other secret data
record3, data? what data.

We need
user1 to be able to see record1 and record2
user2 to be able to see record2 and record3

 From what I see in the mult-tenancy doco, the tenant_id field is a
VARCHAR,  can this be multiple values?

The actual 'multiple tenant' value would be set at creation and very
rarely (if ever) changed, but I couldn't guarantee immutability


Enforced Security
=
Can you prevent access without TenantId?  Otherwise if someone just
edits the connection info they can sidestep all the multi-tenancy
features.   Our users include scientific types who will want to connect
directly using JDBC/Python/Other so we need to be sure to lock this
data down.

Of course they want 'admin' types who CAN see all =) Whether there is a
special connection that allows non-tenanted connections or have a
multi-tenant key that always contains a master tenantid (yuck)

If not possible I guess we have to look at doing something at the HBase
level.

Best Regards

Simon



Re: Any reason for so small phoenix.mutate.batchSize by default?

2019-09-03 Thread Josh Elser

Hey Alexander,

Was just poking at the code for this: it looks like this is really just 
determining the number of mutations that get "processed together" (as 
opposed to a hard limit).


Since you have done some work, I'm curious if you could generate some 
data to help back up your suggestion:


* What does your table DDL look like?
* How large is one mutation you're writing (in bytes)?
* How much data ends up being sent to a RegionServer in one RPC?

You're right in that we would want to make sure that we're sending an 
adequate amount of data to a RegionServer in an RPC, but this is tricky 
to balance for all cases (thus, setting a smaller value to avoid sending 
batches that are too large is safer).
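
If it helps while gathering those numbers: the value can be overridden without rebuilding anything, either in the client-side hbase-site.xml or, as sketched below, per connection (the value 1000 is just the one under discussion, not a recommendation):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.util.Properties;

    public class MutateBatchSizeSketch {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.setProperty("phoenix.mutate.batchSize", "1000");
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181", props)) {
                conn.setAutoCommit(false);
                // ... run UPSERTs here; on commit() the pending mutations are sent
                // to the servers in batches of roughly this size.
                conn.commit();
            }
        }
    }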


On 9/3/19 8:03 AM, Alexander Batyrshin wrote:

  Hello all,

1) There is bug in documentation - http://phoenix.apache.org/tuning.html
phoenix.mutate.batchSize is not 1000, but only 100 by default
https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/query/QueryServicesOptions.java#L164
Changed for https://issues.apache.org/jira/browse/PHOENIX-541


2) I want to discuss this default value. From PHOENIX-541 
 I read about issue 
with MR and wide rows (2MB per row) and it looks like rare case. But in 
most common cases we can get much better write perfomance with batchSize 
= 1000 especially if it used with SALT table


Re: Is there a way to specify split num or reducer num when creating phoenix table ?

2019-08-29 Thread Josh Elser
Configuring salt buckets is not the same thing as pre-splitting a table. 
You should not be setting a crazy large number of buckets like you are.


If you want more parallelism in the MapReduce job, pre-split along 
date-boundaries, with the salt bucket taken into consideration (e.g. 
\x00_date, \x01_date, \x02_date).


HBase requires that a file to be bulk-loaded fit inside of a single 
region. A Reducer will only generate data for a single Region (as a 
Reducer can only generate one file). Create more regions, and you will 
get more parallelism.
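
To illustrate the salt-prefixed split points mentioned above, here is a very rough, hedged sketch using the HBase Admin API together with Phoenix's type encoding. The table name, date boundaries, and bucket count are placeholders, splits are asynchronous requests, and the physical table name depends on whether namespace mapping is enabled; the equivalent splits could also be issued from the HBase shell.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.phoenix.schema.types.PInteger;
    import org.apache.phoenix.util.ByteUtil;

    public class PreSplitByDateSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            int saltBuckets = 3;                                   // matches SALT_BUCKETS = 3
            int[] dateBoundaries = {20181001, 20190101, 20190401}; // illustrative only
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Admin admin = conn.getAdmin()) {
                TableName table = TableName.valueOf("APP.TABLE");  // physical name placeholder
                for (int bucket = 0; bucket < saltBuckets; bucket++) {
                    for (int dt : dateBoundaries) {
                        // Split key = one salt byte + the Phoenix-encoded leading DT column.
                        byte[] splitPoint = ByteUtil.concat(
                                new byte[] { (byte) bucket },
                                PInteger.INSTANCE.toBytes(dt));
                        admin.split(table, splitPoint);
                    }
                }
            }
        }
    }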


On 8/29/19 4:11 AM, you Zhuang wrote:

I have a chronological series of data. Data row like dt, r1 ,r2 ,r3 ,r4 ,r5 ,r6 
,d1 ,d2 ,d3 ,d4 , d5 …

And dt is format  as  20190829 , increasing monotonically, such as 
20190830,20190831...

The query pattern is some like select * from table where dt between 20180620 
and  20190829 and r3 = ? And r6 = ?;

Dt is mandatory, remain filter is some random combination of r1 to r6, selected 
columns are always  all columns *.


I have made dt,r1,r2,… r6 to be compound primary key. The create table clause 
is below:

CREATE TABLE app.table(
  Dt integer not null ,
  R1 integer not null,
  R2 integer not null,
  R3 integer not null,
  R4 integer not null,
  R5 integer not null,
  R6 integer not null,

  D1 decimal(30,6),
  D2 decimal(30,6),
  D3 decimal(30,6),
  D4 decimal(30,6),
  D5 decimal(30,6),
  D6 decimal(30,6)


  CONSTRAINT pk PRIMARY KEY (dt,r1,r2,r3,r4,r5,r6)
) SALT_BUCKETS = 3,UPDATE_CACHE_FREQUENCY = 30,COMPRESSION = 'SNAPPY',  
SPLIT_POLICY = 
'org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy', 
MAX_FILESIZE = '50’;

I have 3 region server so I determine SALT_BUCKETS = 3.

But when I initially load table data with csvbulkload tool , the  dt ranges 
from  20180620 to 20190829, data size is about 1T,

The CsvBulkLoad MapReduce job shows 3 partitions for the reducer; it always failed because the 
number of partitions is so small.

I increased SALT_BUCKETS to 512, but the maximum SALT_BUCKETS is 256; I set it to 256 but 
it did not work.



I know I can SPLIT ON (…) when creating the table, but I don’t know how to 
determine the points, and specifying hundreds of points is daunting.


So Is there a way to specify split num or reducer num when creating phoenix 
table ?

I will be expecting any advice for tuning this scenario.



Re: On duplicate key update

2019-08-26 Thread Josh Elser
Out of the box, Phoenix will provide the same semantics that HBase does 
for concurrent updates to a (data) table.


https://hbase.apache.org/acid-semantics.html

If you're also asking about how index tables remain in sync, the answer 
is a bit more complicated (and has changed in recent versions).


On 8/26/19 2:51 PM, Jon Strayer wrote:
How does atomic update work with multiple clients?  Assuming that there 
is no matching record to begin with the access won’t be locked.  It 
seems like two threads could write conflicting data since they both see 
no existing record (NER).  Is that correct? Or is there something that 
will serialize the writes so that only one of them sees the NER state?




Re: Is there any way to using appropriate index automatically?

2019-08-22 Thread Josh Elser
Sorry 'bout that. Missed that you were doing a local index. Thanks for 
catching my slack, Ankit and Vincent.


On 8/20/19 4:49 AM, you Zhuang wrote:
Er, I also read the sentence “Unlike global indexes, local indexes 
/will/ use an index even when all columns referenced in the query are 
not contained in the index. This is done by default for local indexes 
because we know that the table and index data co-reside on the same 
region server thus ensuring the lookup is local.”


I ‘m totally confused.


On Aug 20, 2019, at 12:32 AM, Josh Elser <mailto:els...@apache.org>> wrote:


http://phoenix.apache.org/faq.html#Why_isnt_my_secondary_index_being_used

On 8/19/19 6:06 AM, you Zhuang wrote:

Phoenix-version: 4.14.3-HBase-1.4-SNAPSHOT
hbase-version: 1.4.6
Table:
CREATE TABLE test_phoenix.app (
dt integer not null,
a bigint not null ,
b bigint not null ,
c bigint not null ,
d bigint not null ,
e bigint not null ,
f bigint not null ,
g bigint not null ,
h bigint not null ,
i bigint not null ,
j bigint not null ,
k bigint not null ,
m decimal(30,6) ,
n decimal(30,6)
CONSTRAINT pk PRIMARY KEY (dt, a,b,c,d,e,f,g,h,i,j,k)
) SALT_BUCKETS = 3,UPDATE_CACHE_FREQUENCY = 30;
Index:
CREATE local INDEX local_c_h_index ON test_phoenix.app (c,h) ASYNC;
(Has been filled data with bulkload and index is active)
Query:
select /*+ INDEX(test_phoenix.app local_c_h_index) */ * from 
TEST_PHOENIX.APP where c=2 and h = 1 limit 5;

select * from TEST_PHOENIX.APP where c=2 and h = 1 limit 5;
The first query will use index local_c_h_index and result shortly, 
the second query won’t , and response slowly.

The explain plan is weird, all showing without using index.
On Aug 19, 2019, at 5:54 PM, Aleksandr Saraseka 
<mailto:asaras...@eztexting.com><mailto:asaras...@eztexting.com>> wrote:


We have no problems with that. I mean indexes are used even without 
hints, if they're suitable for a query.
Maybe you can share your Phoenix version, query, index definition 
and exec plan ?


On Mon, Aug 19, 2019 at 12:46 PM you Zhuang 
<mailto:zhuangzixiao...@gmail.com><mailto:zhuangzixiao...@gmail.com>> wrote:


   Yeah, I mean no hint , use appropriate index automatically. I
   create a local index  and a query with corresponding index column
   filter in where clause. But the query doesn’t use index, with
   index hint it uses it.



--
Aleksandr Saraseka
DBA
380997600401
 • asaras...@eztexting.com • eztexting.com




Re: Is there any way to using appropriate index automatically?

2019-08-19 Thread Josh Elser

http://phoenix.apache.org/faq.html#Why_isnt_my_secondary_index_being_used

On 8/19/19 6:06 AM, you Zhuang wrote:

Phoenix-version: 4.14.3-HBase-1.4-SNAPSHOT
hbase-version: 1.4.6
Table:
CREATE TABLE test_phoenix.app (
dt integer not null,
a bigint not null ,
b bigint not null ,
c bigint not null ,
d bigint not null ,
e bigint not null ,
f bigint not null ,
g bigint not null ,
h bigint not null ,
i bigint not null ,
j bigint not null ,
k bigint not null ,
m decimal(30,6) ,
n decimal(30,6)
CONSTRAINT pk PRIMARY KEY (dt, a,b,c,d,e,f,g,h,i,j,k)
) SALT_BUCKETS = 3,UPDATE_CACHE_FREQUENCY = 30;

Index:
CREATE local INDEX local_c_h_index ON test_phoenix.app (c,h) ASYNC;
(Has been filled data with bulkload and index is active)

Query:
select /*+ INDEX(test_phoenix.app local_c_h_index) */ * from 
TEST_PHOENIX.APP where c=2 and h = 1 limit 5;

select * from TEST_PHOENIX.APP where c=2 and h = 1 limit 5;

The first query will use index local_c_h_index and result shortly, the 
second query won’t , and response slowly.


The explain plan is weird, all showing without using index.



On Aug 19, 2019, at 5:54 PM, Aleksandr Saraseka 
mailto:asaras...@eztexting.com>> wrote:


We have no problems with that. I mean indexes are used even without 
hints, if they're suitable for a query.
Maybe you can share your Phoenix version, query, index definition and 
exec plan ?


On Mon, Aug 19, 2019 at 12:46 PM you Zhuang > wrote:


Yeah, I mean no hint , use appropriate index automatically. I
create a local index  and a query with corresponding index column
filter in where clause. But the query doesn’t use index, with
index hint it uses it.



--
Aleksandr Saraseka
DBA
380997600401
 • asaras...@eztexting.com • eztexting.com







Re: java.io.IOException: Added a key not lexically larger than previous

2019-08-15 Thread Josh Elser
Are you using a local index? Can you share the basics please (HBase and 
Phoenix versions).


I'm not seeing if you've shared this previously on this or another 
thread. Sorry if you have.


Short-answer, it's possible that something around secondary indexing in 
Phoenix causes this but not possible to definitively say in a vaccuum.


On 8/15/19 1:19 PM, Alexander Batyrshin wrote:

Is it possible that Phoenix is the reason for this problem?


On 20 Jun 2019, at 04:16, Alexander Batyrshin <0x62...@gmail.com> wrote:

Hello,
Are there any ideas where this problem comes from and how to fix?

Jun 18 21:38:05 prod022 hbase[148581]: 2019-06-18 21:38:05,348 WARN  
[MemStoreFlusher.0] regionserver.HStore: Failed flushing store file, retrying 
num=9
Jun 18 21:38:05 prod022 hbase[148581]: java.io.IOException: Added a key not 
lexically larger than previous. Current cell = 
\x0D100395583733fW+,WQ/d:p/1560882798036/DeleteColumn/vlen=0/seqid=30023231, 
lastCell = \x0D100395583733fW+,WQ/d:p/1560882798036/Put/vlen=29/seqid=30023591
Jun 18 21:38:05 prod022 hbase[148581]: at 
org.apache.hadoop.hbase.io.hfile.AbstractHFileWriter.checkKey(AbstractHFileWriter.java:204)
Jun 18 21:38:05 prod022 hbase[148581]: at 
org.apache.hadoop.hbase.io.hfile.HFileWriterV2.append(HFileWriterV2.java:279)
Jun 18 21:38:05 prod022 hbase[148581]: at 
org.apache.hadoop.hbase.io.hfile.HFileWriterV3.append(HFileWriterV3.java:87)
Jun 18 21:38:05 prod022 hbase[148581]: at 
org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:1053)
Jun 18 21:38:05 prod022 hbase[148581]: at 
org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:139)
Jun 18 21:38:05 prod022 hbase[148581]: at 
org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:75)
Jun 18 21:38:05 prod022 hbase[148581]: at 
org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:969)
Jun 18 21:38:05 prod022 hbase[148581]: at 
org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2484)
Jun 18 21:38:05 prod022 hbase[148581]: at 
org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2622)
Jun 18 21:38:05 prod022 hbase[148581]: at 
org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2352)
Jun 18 21:38:05 prod022 hbase[148581]: at 
org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2314)
Jun 18 21:38:05 prod022 hbase[148581]: at 
org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:2200)
Jun 18 21:38:05 prod022 hbase[148581]: at 
org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:2125)
Jun 18 21:38:05 prod022 hbase[148581]: at 
org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:512)
Jun 18 21:38:05 prod022 hbase[148581]: at 
org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:482)
Jun 18 21:38:05 prod022 hbase[148581]: at 
org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$900(MemStoreFlusher.java:76)
Jun 18 21:38:05 prod022 hbase[148581]: at 
org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:264)
Jun 18 21:38:05 prod022 hbase[148581]: at 
java.lang.Thread.run(Thread.java:748)
Jun 18 21:38:05 prod022 hbase[148581]: 2019-06-18 21:38:05,373 FATAL 
[MemStoreFlusher.0] regionserver.HRegionServer: ABORTING region server 
prod022,60020,1560521871613: Replay of WAL required. Forcing server shutdown
Jun 18 21:38:05 prod022 hbase[148581]: 
org.apache.hadoop.hbase.DroppedSnapshotException: region: 
TBL_C,\x0D04606203096428+jaVbx.,1558885224779.b4633aee06956663b05e8322ce34b0a3.
Jun 18 21:38:05 prod022 hbase[148581]: at 
org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2675)
Jun 18 21:38:05 prod022 hbase[148581]: at 
org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2352)
Jun 18 21:38:05 prod022 hbase[148581]: at 
org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2314)
Jun 18 21:38:05 prod022 hbase[148581]: at 
org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:2200)
Jun 18 21:38:05 prod022 hbase[148581]: at 
org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:2125)
Jun 18 21:38:05 prod022 hbase[148581]: at 
org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:512)
Jun 18 21:38:05 prod022 hbase[148581]: at 
org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:482)
Jun 18 21:38:05 prod022 hbase[148581]: at 
org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$900(MemStoreFlusher.java:76)
Jun 18 21:38:05 prod022 hbase[148581]: at 

Re: Phoenix with multiple HBase masters

2019-08-08 Thread Josh Elser

PQS (really, Avatica[1]) doesn't have a health-check endpoint.

We would need to figure out what it means for Avatica to check its own 
health, and export such an endpoint.


[1] https://calcite.apache.org/avatica/docs/
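
In the meantime, the closest thing available is a plain TCP-level liveness probe against the PQS port. This is an assumption about what an ELB-style checker could do, not an Avatica feature, and it says nothing about query health:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PqsLivenessProbe {
    // Returns true if something is accepting connections on the PQS port.
    // This is only liveness; it does not prove queries would succeed.
    public static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hostname is a placeholder; 8765 is the default PQS port.
        System.out.println(isListening("pqs-host.example.com", 8765, 2000));
    }
}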

On 8/7/19 4:19 PM, jesse wrote:
After further investigation, I figured there are two types of clients: 
thin client and thick client.


- thin client talks to PQS, only ELB or PQS http address is supported
- thick client supports ZK

I don't know the PQS health URL yet; it's not documented anywhere. Does anyone know?



On Wed, Aug 7, 2019, 12:06 PM Aleksandr Saraseka <asaras...@eztexting.com> wrote:


I didn't try this personally, but according to the documentation
https://phoenix.apache.org/server.html it has possibility to have
ZK-based load-balancing. Please refer to the "Load balancing"
section at the bottom.

On Wed, Aug 7, 2019 at 9:14 PM jesse <chat2je...@gmail.com> wrote:

Thank you all, very helpful information.

1) for server side ELB, what is the PQS health check url path?

2) Does the Phoenix client library support client-side
load-balancing? i.e., the client gets the list of PQS addresses from ZK
and performs load balancing itself. Then an ELB wouldn't be needed.



On Wed, Aug 7, 2019, 9:01 AM Josh Elser <els...@apache.org> wrote:

Great answer, Aleksandr!

Also worth mentioning there is only ever one active HBase
Master at a
time. If you have multiple started, one will be active as
the master and
the rest will be waiting as a standby in case the current
active master
dies for some reason (expectedly or unexpectedly).

On 8/7/19 9:55 AM, Aleksandr Saraseka wrote:
 > Hello.
 > - Phoenix libs should be installed only on RegionServers.
 > - QueryServer - it's up to you, PQS can be installed
anywhere you want
 > - No. QueryServer is using ZK quorum to get everything it
needs
 > - If you need to balance traffic with multiple PQSs - then yes, but
 > again - it's up to you. Multiple PQSs are not required just because you
 > have multiple HBase masters.
 >
 > On Wed, Aug 7, 2019 at 12:58 AM jesse <chat2je...@gmail.com> wrote:
 >
 >     Our cluster used to have one hbase master, now a
secondary is added.
 >     For Phoenix, what changes should we make?
 >       - do we have to install new hbase libraries on the
new hbase
 >     master node?
 >     - do we need to install new query server on the hbase
master?
 >     - any configuration changes should we make?
 >     - do we need an ELB for the query server?
 >
 >     Thanks
 >
 >
 >
 >
 >
 > --
 > Aleksandr Saraseka
 > DBA
 > 380997600401 *•* asaras...@eztexting.com *•* eztexting.com



--
Aleksandr Saraseka
DBA
380997600401 • asaras...@eztexting.com

Re: Phoenix with multiple HBase masters

2019-08-07 Thread Josh Elser

Great answer, Aleksandr!

Also worth mentioning there is only ever one active HBase Master at a 
time. If you have multiple started, one will be active as the master and 
the rest will be waiting as a standby in case the current active master 
dies for some reason (expectedly or unexpectedly).


On 8/7/19 9:55 AM, Aleksandr Saraseka wrote:

Hello.
- Phoenix libs should be installed only on RegionServers.
- QueryServer - it's up to you, PQS can be installed anywhere you want
- No. QueryServer is using ZK quorum to get everything it needs
- If you need to balance traffic with multiple PQSs - then yes, but 
again - it's up to you. Multiple PQSs are not required just because you have 
multiple HBase masters.


On Wed, Aug 7, 2019 at 12:58 AM jesse wrote:


Our cluster used to have one hbase master, now a secondary is added.
For Phoenix, what changes should we make?
  - do we have to install new hbase libraries on the new hbase
master node?
- do we need to install new query server on the hbase master?
- any configuration changes should we make?
- do we need an ELB for the query server?

Thanks





--
Aleksandr Saraseka
DBA
380997600401 • asaras...@eztexting.com • eztexting.com





Re: Phoenix client threads

2019-08-06 Thread Josh Elser
Please take a look at the documentation: 
https://phoenix.apache.org/tuning.html
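
For reference, a minimal sketch of the relevant knobs. The property names below are the documented Phoenix/HBase client settings, the values are examples only, and they are normally set in the client-side hbase-site.xml before the first connection is made:

import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

public class ClientThreadPoolTuning {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Phoenix client executor pool (default 128) and its queue (default 5000).
        props.setProperty("phoenix.query.threadPoolSize", "64");
        props.setProperty("phoenix.query.queueSize", "2500");
        // Assumption: this HBase 1.x property caps the shared "hconnection" pool
        // (the one that grows toward 256 threads in the question below).
        props.setProperty("hbase.hconnection.threads.max", "64");
        // Placeholder JDBC URL; only the first connection's properties seed the pools.
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181:/hbase", props)) {
            System.out.println("Connected: " + conn.getMetaData().getDatabaseProductName());
        }
    }
}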


On 7/29/19 4:24 AM, Sumanta Gh wrote:

Hi,
When we use Phoenix client, there are by default 10 new 
PHOENIX-SCANNER-RENEW-LEASE-threads created.
There are also new threads spawned for hconnection-shared-thread-pool. 
This thread pool goes upto having max 256 threads.

How can these thread pool sizes be configured using configuration?


Regards
Sumanta

=-=-=
Notice: The information contained in this e-mail
message and/or attachments to it may contain
confidential or privileged information. If you are
not the intended recipient, any dissemination, use,
review, distribution, printing or copying of the
information contained in this e-mail message
and/or attachments to it are strictly prohibited. If
you have received this communication in error,
please notify us by reply e-mail or telephone and
immediately and permanently delete the message
and any attachments. Thank you



Re: Phoenix Upgrade 4.7 to 4.14 - Cannot use Phoenix

2019-08-06 Thread Josh Elser
Looks like this is a bug in the system table upgrade code path which 
doesn't handle the jump from 4.7 to 4.14 correctly. This big of a 
version jump is not tested/supported in Apache.


Does CLABS give you a guarantee that this will work? It would likely be 
good to contact Cloudera support if you need help on this.


On 7/28/19 10:10 PM, Alexander Lytchier wrote:

Hi,

Following an update of Cloudera Manager and CDH from 5.7 to 5.14, and 
Phoenix from 4.7.0 (CLABS) to 4.14.0, sqlline.py is no longer working.


Apart from the upgrade of Phoenix I also created the table 
SYSTEM.MUTEX in HBase since it's not used in 4.7.0, based on 
http://apache-phoenix-user-list.1124778.n5.nabble.com/Error-upgrading-from-from-4-7-x-to-4-13-x-td4210.html


$ ./sqlline.py localhost:2181:/hbase

Setting property: [incremental, false]

Setting property: [isolation, TRANSACTION_READ_COMMITTED]

issuing: !connect jdbc:phoenix:localhost:2181:/hbase none none 
org.apache.phoenix.jdbc.PhoenixDriver


Connecting to jdbc:phoenix:localhost:2181:/hbase

SLF4J: Class path contains multiple SLF4J bindings.

SLF4J: Found binding in 
[jar:file:/opt/cloudera/parcels/APACHE_PHOENIX-4.14.0-cdh5.14.2.p0.3/lib/phoenix/phoenix-4.14.0-cdh5.14.2-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]


SLF4J: Found binding in 
[jar:file:/opt/cloudera/parcels/CDH-5.14.4-1.cdh5.14.4.p0.3/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]


SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
explanation.


SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

19/07/26 17:40:07 WARN util.NativeCodeLoader: Unable to load 
native-hadoop library for your platform... using builtin-java classes 
where applicable


19/07/26 17:40:32 WARN query.ConnectionQueryServicesImpl: Starting 
restore of SYSTEM.CATALOG using snapshot 
SNAPSHOT_SYSTEM.CATALOG_4.7.x_TO_4.14.0_20190726174011 because upgrade 
failed


19/07/26 17:40:37 WARN query.ConnectionQueryServicesImpl: Successfully 
restored SYSTEM.CATALOG using snapshot 
SNAPSHOT_SYSTEM.CATALOG_4.7.x_TO_4.14.0_20190726174011


19/07/26 17:40:42 WARN query.ConnectionQueryServicesImpl: Successfully 
restored and enabled SYSTEM.CATALOG using snapshot 
SNAPSHOT_SYSTEM.CATALOG_4.7.x_TO_4.14.0_20190726174011


Error: ERROR 1012 (42M03): Table undefined. tableName=MYTABLE 
(state=42M03,code=1012)


org.apache.phoenix.schema.TableNotFoundException: ERROR 1012 (42M03): 
Table undefined. tableName=MYTABLE


     at 
org.apache.phoenix.compile.FromCompiler$BaseColumnResolver.createTableRef(FromCompiler.java:577)


     at 
org.apache.phoenix.compile.FromCompiler$SingleTableColumnResolver.(FromCompiler.java:391)


     at 
org.apache.phoenix.compile.FromCompiler$SingleTableColumnResolver.(FromCompiler.java:383)


     at 
org.apache.phoenix.compile.FromCompiler.getResolver(FromCompiler.java:263)


     at 
org.apache.phoenix.compile.CreateIndexCompiler.compile(CreateIndexCompiler.java:50)


     at 
org.apache.phoenix.jdbc.PhoenixStatement$ExecutableCreateIndexStatement.compilePlan(PhoenixStatement.java:1073)


     at 
org.apache.phoenix.jdbc.PhoenixStatement$ExecutableCreateIndexStatement.compilePlan(PhoenixStatement.java:1059)


     at 
org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:401)


     at 
org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:391)


     at 
org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)


     at 
org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:389)


     at 
org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:378)


     at 
org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1825)


     at 
org.apache.phoenix.util.UpgradeUtil.upgradeLocalIndexes(UpgradeUtil.java:456)


     at 
org.apache.phoenix.query.ConnectionQueryServicesImpl.upgradeSystemCatalogIfRequired(ConnectionQueryServicesImpl.java:2899)


     at 
org.apache.phoenix.query.ConnectionQueryServicesImpl.upgradeSystemTables(ConnectionQueryServicesImpl.java:3050)


     at 
org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2584)


     at 
org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2491)


     at 
org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)


     at 
org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:2491)


     at 
org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:255)


     at 
org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)


     

Re: Secondary Indexes - Missing Data in Phoenix

2019-07-25 Thread Josh Elser
Local indexes are stored in the same table as the data. They are "local" 
to the data.


I would not be surprised if you are running into issues because you are 
using such an old version of Phoenix.


On 7/24/19 10:35 PM, Alexander Lytchier wrote:

Hi,

We are currently using Cloudera as a package manager for our Hadoop 
Cluster with Phoenix 4.7.0 (CLABS_PHOENIX) and HBase 1.2.0-cdh5.7.6. 
Phoenix 4.7.0 appears to be the latest version supported 
(http://archive.cloudera.com/cloudera-labs/phoenix/parcels/latest/) even 
though it’s old.


The table in question has a binary row-key: pk BINARY(30): 1 Byte for 
salting, 8 Bytes - timestamp (Long), 20 Bytes - hash result of other 
record fields. + 1 extra byte for unknown issue about updating schema in 
future (not sure if relevant). We are currently facing performance 
issues and are attempting to mitigate it by adding secondary indexes.


When generating a local index synchronously with the following command:

CREATE LOCAL INDEX INDEX_TABLE ON “MyTable” (“cf”.”type”);

I can see that the resulting index table in Phoenix is populated, in 
HBase I can see the row-key of the index table and queries work as expected:


\x00\x171545413\x00 column=cf:cf:type, timestamp=1563954319353, 
value=1545413


\x00\x00\x00\x01b\xB2s\xDB

@\x1B\x94\xFA\xD4\x14c\x0B

d$\x82\xAD\xE6\xB3\xDF\x06

\xC9\x07@\xB9\xAE\x00

However, for the case where the index is created asynchronously, and 
then populated using the IndexTool, with the following commands:



CREATE LOCAL INDEX INDEX_TABLE ON “MyTable” (“cf”.”type”) ASYNC;

sudo -u hdfs HADOOP_CLASSPATH=`hbase classpath` hadoop jar 
/opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/lib/hbase/bin/../lib/hbase-client-1.2.0-cdh5.7.1.jar 
org.apache.phoenix.mapreduce.index.IndexTool --data-table "MyTable" 
--index-table INDEX_TABLE --output-path hdfs://nameservice1/


I get the following row-key in HBase:


\x00\x00\x00\x00\x00\x00\x column=cf:cf:type, timestamp=1563954000238, 
value=1545413


00\x00\x00\x00\x00\x00\x00

\x00\x00\x00\x00\x00\x00\x

00\x00\x00\x00\x00\x00\x00

\x00\x00\x00\x00\x00\x00\x

151545413\x00\x00\x

00\x00\x01b\xB2s\xDB@\x1B\

x94\xFA\xD4\x14c\x0Bd$\x82

\xAD\xE6\xB3\xDF\x06\xC9\x

07@\xB9\xAE\x00

It has 32 additional 0-bytes (\x00). Why is there a difference – is 
one expected? What’s more, the index table in Phoenix is empty (I guess 
it’s not able to read the underlying HBase index table with that key?), 
so any queries that use the local index in Phoenix return no value.


Do you have any suggestions? We must use the async method to populate 
the index table on production because of the massive amounts of data, 
but if Phoenix is not able to read the index table it cannot be used for 
queries.


Is it possible this issue has been fixed in a newer version?

Thanks



Re: Alter Table throws java.lang.NullPointerException

2019-07-24 Thread Josh Elser

Please start by sharing the version of Phoenix that you're using.

Did you search Jira to see if there was someone else who also reported 
this issue?


On 7/23/19 4:24 PM, Alexander Batyrshin wrote:

Hello all,

Got this:

alter table TEST_TABLE SET APPEND_ONLY_SCHEMA=true;
java.lang.NullPointerException
 at 
org.apache.phoenix.schema.MetaDataClient.addColumn(MetaDataClient.java:3240)
 at 
org.apache.phoenix.schema.MetaDataClient.addColumn(MetaDataClient.java:3221)
 at 
org.apache.phoenix.jdbc.PhoenixStatement$ExecutableAddColumnStatement$1.execute(PhoenixStatement.java:1432)
 at 
org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:408)
 at 
org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:391)
 at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
 at 
org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:390)
 at 
org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:378)
 at 
org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1825)
 at sqlline.Commands.execute(Commands.java:822)
 at sqlline.Commands.sql(Commands.java:732)
 at sqlline.SqlLine.dispatch(SqlLine.java:813)
 at sqlline.SqlLine.begin(SqlLine.java:686)
 at sqlline.SqlLine.start(SqlLine.java:398)
 at sqlline.SqlLine.main(SqlLine.java:291)


Any ideas how to fix?



Re: Spark sql query the hive external table mapped from phoenix always throw out Class org.apache.phoenix.hive.PhoenixSerDe not found exception

2019-07-11 Thread Josh Elser
(Moving this over to the user list as that's the appropriate list for 
this question)


Do you get an error? We can't help you with only a "it didn't work" :)

I'd suggest that you try to narrow down the scope of the problem: is it 
unique to Hive external tables? Can you use a different Hive 
StorageHandler successfully (e.g. the HBaseStorageHandler)?


Finally, as you're using HDP, please also consider using their customer 
support.


On 7/11/19 2:18 AM, 马士成 wrote:

Hello All,

In Apache Phoenix homepage,  It shows two additional functions: Apache 
Spark Integration and Phoenix Storage Handler for Apache Hive,


According to the guidance, I can query a Phoenix table from the Beeline CLI, and I 
can load a Phoenix table as a DataFrame using Spark SQL.


So my question is:

Does Phoenix support querying, via Spark SQL, a Hive external table mapped from 
Phoenix?


I am working on HDP 3.0 (Phoenix 5.0, HBase 2.0, Hive 3.1.0, Spark 2.3.1) 
and facing the issue mentioned in the subject.


I tried to solve this problem but failed. I found some similar questions 
on the internet, but the answers didn't work for me.


My submit command :

   spark-submit test3.py --jars \

   /usr/hdp/current/phoenix-client/lib/phoenix-hive-5.0.0.3.0.0.0-1634.jar\

   ,/usr/hdp/current/phoenix-client/lib/hadoop-mapreduce-client-core.jar\

   ,/usr/hdp/current/phoenix-client/lib/phoenix-core-5.0.0.3.0.0.0-1634.jar\

   
,/usr/hdp/current/phoenix-client/lib/phoenix-spark-5.0.0.3.0.0.0-1634.jar\


   ,/usr/hdp/current/hive-client/lib/hive-metastore-3.1.0.3.0.0.0-1634.jar\

   ,/usr/hdp/current/hive-client/lib/hive-common-3.1.0.3.0.0.0-1634.jar\

   ,/usr/hdp/current/hive-client/lib/hbase-client-2.0.0.3.0.0.0-1634.jar\

   ,/usr/hdp/current/hive-client/lib/hbase-mapreduce-2.0.0.3.0.0.0-1634.jar\

   ,/usr/hdp/current/hive-client/lib/hive-serde-3.1.0.3.0.0.0-1634.jar\

   ,/usr/hdp/current/hive-client/lib/hive-shims-3.1.0.3.0.0.0-1634.jar

Log attached and Demo code as below:

   from pyspark.sql import SparkSession

   if __name__ == '__main__':

   spark = SparkSession.builder \

   .appName("test") \

   .enableHiveSupport() \

   .getOrCreate()

   df= spark.sql("select count(*) from ajmide_dw.part_device")

   df.show()

Similar Issues:

https://community.hortonworks.com/questions/140097/facing-issue-from-spark-sql.html

https://stackoverflow.com/questions/51501044/unable-to-access-hive-external-tables-from-spark-shell

Any comment or suggestion is appreciated!

Thanks,

Shi-Cheng, Ma



Re: Questions about ZK Load Balancer

2019-07-09 Thread Josh Elser

Yeah, that's correct.

I think I requested some documentation to be added by the original 
author to clarify that it's not end-to-end usable, but I don't think it 
ever happened. The "load balancer" isn't anything more than service 
advertisement, IIRC.


IMO, the write-up I made here[1] is going to give you something more 
usable out of the box.


If you have the time to invest in fixing this up, let's chat. We can 
make this story better.


[1] 
https://community.hortonworks.com/articles/9377/deploying-the-phoenix-query-server-in-production-e.html


On 7/9/19 6:01 AM, Reid Chan wrote:

Hi community,

Recently, I'm trying to apply the ZK-based Load Balancer on a production env.

But it looks like a half-done feature; I couldn't find how a query server 
client gets a registered QS from the LB in the client-side codebase.

There's one method, LoadBalancer#getSingleServiceLocation, supposed to be 
called from the client side, but it is dead code and never invoked.

Highly appreciate any code pointer or suggestion or advice.


--

Best regards,
R.C




Re: Curl kerberized QueryServer using protobuf type

2019-07-02 Thread Josh Elser

Hey Reid,

Protobuf is a binary format -- this is error'ing out because you're 
sending it plain-text.


You're going to have quite a hard time constructing messages in bash 
alone. There are lots of language bindings[1]. You should be able to 
pick any of these to help encode/decode messages (if you want to use 
cURL as your "transport").


IMO, Avatica's protocol is too complex (by necessity of implementing all 
of the JDBC API) to just throw some hand-constructed JSON at. I think 
the better solution would be to think about some simpler API that 
exposes just the bare bones if you want something developer focused.


[1] https://developers.google.com/protocol-buffers/docs/reference/overview
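
As a concrete illustration of that suggestion, here is a hedged sketch that uses the protobuf classes bundled with Avatica to produce the request body, while leaving transport (and SPNEGO) to curl. The class, field, and wire-message names follow Avatica's Requests.proto/Common.proto, but treat them as assumptions and verify against the Avatica version on your classpath:

import java.nio.file.Files;
import java.nio.file.Paths;
import org.apache.calcite.avatica.proto.Common;
import org.apache.calcite.avatica.proto.Requests;

public class BuildOpenConnectionRequest {
    public static void main(String[] args) throws Exception {
        Requests.OpenConnectionRequest open = Requests.OpenConnectionRequest.newBuilder()
                .setConnectionId("conn-" + System.currentTimeMillis())
                .build();
        // Avatica wraps every request in a WireMessage envelope naming the payload type.
        Common.WireMessage wire = Common.WireMessage.newBuilder()
                .setName("org.apache.calcite.avatica.proto.Requests$OpenConnectionRequest")
                .setWrappedMessage(open.toByteString())
                .build();
        Files.write(Paths.get("open_connection.bin"), wire.toByteArray());
        // Then keep cURL as the transport, e.g.:
        //   curl --negotiate -u : -H 'Content-Type: application/octet-stream' \
        //        --data-binary @open_connection.bin http://hostname:8765/
    }
}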

On 7/1/19 5:53 AM, Reid Chan wrote:

Hi team and other users,

Following is the script used for connecting to QS,
{code}
#!/usr/bin/env bash

set -u

AVATICA="hostname:8765"
echo $AVATICA
CONNECTION_ID="conn-$(whoami)-$(date +%s)"

echo "Open connection"
openConnectionReq="message OpenConnectionRequest {string connection_id = 
$CONNECTION_ID;}"
curl -i --negotiate -u : -w "\n" "$AVATICA" -H "Content-Type: application/protobuf" 
--data "$openConnectionReq"
{code}

But it ended with:
org.apache.calcite.avatica.proto.Responses$ErrorResponse�
�rg.apache.calcite.avatica.com.google.protobuf.InvalidProtocolBufferException$InvalidWireTypeException:
 Protocol message tag had invalid wire type.
at 
org.apache.calcite.avatica.com.google.protobuf.InvalidProtocolBufferException.invalidWireType(InvalidProtocolBufferException.java:111)
at 
org.apache.calcite.avatica.com.google.protobuf.CodedInputStream$ArrayDecoder.skipField(CodedInputStream.java:591)
at 
org.apache.calcite.avatica.proto.Common$WireMessage.(Common.java:12544)
at 
org.apache.calcite.avatica.proto.Common$WireMessage.(Common.java:12511)
at 
org.apache.calcite.avatica.proto.Common$WireMessage$1.parsePartialFrom(Common.java:13054)
at 
org.apache.calcite.avatica.proto.Common$WireMessage$1.parsePartialFrom(Common.java:13049)
at 
org.apache.calcite.avatica.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:91)
at 
org.apache.calcite.avatica.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:96)
at 
org.apache.calcite.avatica.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
at 
org.apache.calcite.avatica.com.google.protobuf.GeneratedMessageV3.parseWithIOException(GeneratedMessageV3.java:311)
at 
org.apache.calcite.avatica.proto.Common$WireMessage.parseFrom(Common.java:12757)
at 
org.apache.calcite.avatica.remote.ProtobufTranslationImpl.parseRequest(ProtobufTranslationImpl.java:410)
at 
org.apache.calcite.avatica.remote.ProtobufHandler.decode(ProtobufHandler.java:51)
at 
org.apache.calcite.avatica.remote.ProtobufHandler.decode(ProtobufHandler.java:31)
at 
org.apache.calcite.avatica.remote.AbstractHandler.apply(AbstractHandler.java:93)
at 
org.apache.calcite.avatica.remote.ProtobufHandler.apply(ProtobufHandler.java:46)
at 
org.apache.calcite.avatica.server.AvaticaProtobufHandler$2.call(AvaticaProtobufHandler.java:123)
at 
org.apache.calcite.avatica.server.AvaticaProtobufHandler$2.call(AvaticaProtobufHandler.java:121)
at 
org.apache.phoenix.queryserver.server.QueryServer$PhoenixDoAsCallback$1.run(QueryServer.java:500)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754)
at 
org.apache.phoenix.queryserver.server.QueryServer$PhoenixDoAsCallback.doAsRemoteUser(QueryServer.java:497)
at 
org.apache.calcite.avatica.server.HttpServer$Builder$1.doAsRemoteUser(HttpServer.java:884)
at 
org.apache.calcite.avatica.server.AvaticaProtobufHandler.handle(AvaticaProtobufHandler.java:120)
at 
org.apache.phoenix.shaded.org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:542)
at 
org.apache.phoenix.shaded.org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
at 
org.apache.phoenix.shaded.org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at 
org.apache.phoenix.shaded.org.eclipse.jetty.server.Server.handle(Server.java:499)
at 
org.apache.phoenix.shaded.org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:311)
at 
org.apache.phoenix.shaded.org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
at 
org.apache.phoenix.shaded.org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
at 
org.apache.phoenix.shaded.org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
at 
org.apache.phoenix.shaded.org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
at 

Re: A strange question about Phoenix

2019-06-20 Thread Josh Elser
Make sure you have updated statistics for your table. Depending on the 
last time you created the stats, you may have to manually delete the 
stats from SYSTEM.STATS (as there are safeguards to prevent re-creating 
statistics too frequently).


There have been some bugs in the past that result from invalid stats 
guideposts.
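
A hedged sketch of doing that by hand (the table name is a placeholder, and the SYSTEM.STATS predicate assumes the standard stats layout, so double-check it against your version):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class RefreshTableStats {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181:/hbase");
             Statement stmt = conn.createStatement()) {
            // Drop any stale guideposts left over from before the snapshot was cloned.
            stmt.executeUpdate("DELETE FROM SYSTEM.STATS WHERE PHYSICAL_NAME = 'MY_TABLE'");
            conn.commit();
            // Recollect statistics so the guideposts match the restored data.
            stmt.execute("UPDATE STATISTICS MY_TABLE");
        }
    }
}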


On 6/19/19 3:25 PM, jesse wrote:

1) hbase clone-snapshot into my_table
2) sqlline.py zk:port console  to create my_table.

Very straightforward.



On Wed, Jun 19, 2019, 11:40 AM anil gupta wrote:


Sounds strange.
What steps did you follow to restore the snapshot of the Phoenix table?

On Tue, Jun 18, 2019 at 9:34 PM jesse <chat2je...@gmail.com> wrote:

hi:

  When my table is restored via hbase clone-snapshot,

1) sqlline.py console shows the proper number of records: 
select count(*) from my_table.

2) select my_column from my_table limit 1 works fine.

  However, select * from my_table limit 1; returns no row.

  Do I need to perform some extra operations?

  thanks








-- 
Thanks & Regards,

Anil Gupta



Re: Phoenix 4 to 5 Upgrade Path

2019-06-14 Thread Josh Elser
IIRC, the guidance is that 4.14 should work back to 4.12. We're talking 
about minor releases in this case, not bugfix. Every bugfix release 
should be compatible within the current minor release. This is one of 
the tenets of https://semver.org/


You should not have issues going to 5.0.0, but I don't know of anyone 
who has explicitly tested this. You should proceed with caution.


On 6/14/19 2:07 AM, Vova Galchenko wrote:

Hey Josh,

thanks for getting back to me. The current Phoenix version we're on is 
4.14.1. Do I understand correctly that the Phoenix community intends to 
provide data compatibility at least two versions back? Does this 
intention apply across major version boundaries? More specifically, does 
it imply that data produced by 4.14.1 is intended to be compatible with 
4.14.2 and 5.0.0?


Thank you.

vova

On Wed, Jun 12, 2019 at 1:18 PM Josh Elser <els...@apache.org> wrote:


What version of Phoenix 4 are you coming from?

Of note, if you're lagging far behind, you'll get bit by the column
encoding turning on by default in 4.10 [1]

In general, before we update the system catalog table, we take a
snapshot of it, so you can roll back (although this would be
manual). In
terms of testing as a community, we only do testing back two releases.
After that, your mileage may vary.

- Josh

[1] https://phoenix.apache.org/columnencoding.html

On 6/11/19 9:02 PM, Vova Galchenko wrote:
 > Hello Phoenix Users List!
 >
 > We at Box are thinking about the upgrade story from Phoenix 4 to
5. As
 > part of that, we'd like to understand if these Phoenix versions
write
 > data in formats compatible with each other. In other words,
suppose we
 > have an HBase 1.4 cluster used by Phoenix 4. Can I shut
everything down,
 > upgrade HBase to 2.0 and Phoenix to 5, start everything up again,
and
 > expect to preserve the integrity of the data stored in the original
 > cluster? If not, is there guidance for what the upgrade process
might
 > look like?
 >
 > Thanks,
 >
 > vova



Re: Phoenix 4 to 5 Upgrade Path

2019-06-12 Thread Josh Elser

What version of Phoenix 4 are you coming from?

Of note, if you're lagging far behind, you'll get bit by the column 
encoding turning on by default in 4.10 [1]


In general, before we update the system catalog table, we take a 
snapshot of it, so you can roll back (although this would be manual). In 
terms of testing as a community, we only do testing back two releases. 
After that, your mileage may vary.


- Josh

[1] https://phoenix.apache.org/columnencoding.html

On 6/11/19 9:02 PM, Vova Galchenko wrote:

Hello Phoenix Users List!

We at Box are thinking about the upgrade story from Phoenix 4 to 5. As 
part of that, we'd like to understand if these Phoenix versions write 
data in formats compatible with each other. In other words, suppose we 
have an HBase 1.4 cluster used by Phoenix 4. Can I shut everything down, 
upgrade HBase to 2.0 and Phoenix to 5, start everything up again, and 
expect to preserve the integrity of the data stored in the original 
cluster? If not, is there guidance for what the upgrade process might 
look like?


Thanks,

vova


Re: Problem with ROW_TIMESTAMP

2019-06-10 Thread Josh Elser
When you want to use Phoenix to query your data, you're going to have a 
much better time if you also use Phoenix to load the data.


Unless you specifically know what you're doing (and how to properly 
serialize the data into HBase so that Phoenix can read it), you should 
use Phoenix to both read and write your data.
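
As a minimal sketch of loading through Phoenix, using the "threatintel" mapping defined in the question below (with a ROW_TIMESTAMP primary key you can either bind TS explicitly, as here, or omit it and let the server assign it):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.Timestamp;

public class UpsertThreatIntel {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181:/hbase");
             PreparedStatement ps = conn.prepareStatement(
                     "UPSERT INTO \"threatintel\" (TS, \"v\") VALUES (?, ?)")) {
            ps.setTimestamp(1, new Timestamp(System.currentTimeMillis()));
            ps.setString(2, "{\"ip\":\"222.102.76.151\"}");
            ps.executeUpdate();
            conn.commit();  // Phoenix buffers mutations until commit
        }
    }
}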


On 6/7/19 1:31 PM, David Auclair wrote:

Hi,

I’m having a problem with ROW_TIMESTAMP not producing proper output:

Versions (Based on HDP 2.6.5):

HBase Shell = Version 1.1.2.2.6.5.1100-53 (assuming HBase is the same 
version?)


Phoenix = phoenix-4.7.0.2.6.5.1100-53-client.jar

Phoenix Sqlline = 1.1.8

In the HBase shell, I create a table & cf:

hbase(main):001:0> create "threatintel","t"

Once I’ve loaded some data, I can see:

column=t:v, timestamp=1559914430391, value={"ip":"222.102.76.151"}

Via Phoenix I’m trying to create a mapping to the existing table:

create table "threatintel" (ts TIMESTAMP NOT NULL, "t"."v" varchar, 
CONSTRAINT pk PRIMARY KEY (ts ROW_TIMESTAMP));


Via Phoenix sqlline:

Select * from “threatintel” limit 1;

Results in:

| 292264620-05-14 12:29:06.783  | {"ip":"203.198.118.221"}  |

Pretty sure that timestamp is incorrect. (And no, that wasn’t the same 
datapoint, but the timestamps were all within a few seconds of each 
other from a bulk import)


Did I do something wrong?  Any other info I can provide?

Thanks in advance,

David Auclair



Re: Fwd: DELIVERY FAILURE: Error transferring to QCMBSJ601.HERMES.SI.SOCGEN; Maximum hop count exceeded. Message probably in a routing loop.

2019-05-28 Thread Josh Elser
https://issues.apache.org/jira/browse/INFRA-18170 still pending. Need to 
poke.


On 5/23/19 2:49 PM, William Shen wrote:

Josh,

Any luck on getting Infra to help, or finding a moderator? I'm still 
getting these spam... (Who are the moderators anyway?)


Thanks,

- Will

On Fri, Apr 5, 2019 at 3:38 PM William Shen <wills...@marinsoftware.com> wrote:


Thanks Josh for looking into this!

On Fri, Apr 5, 2019 at 12:52 PM Josh Elser <els...@apache.org> wrote:

I can't do this right now, but I've asked Infra to give me the
karma I
need to remove this person.

If a moderator of user@phoenix is watching this, they can use
the tool
on Whimsy[1] to remove this subscriber.

- Josh

[1] https://whimsy.apache.org/committers/moderationhelper.cgi

On 4/5/19 3:47 PM, Josh Elser wrote:
 > Not just you. I have an email filter set up for folks like
this :)
 >
 > A moderator should be able to force a removal. Trying it now
to see if
 > I'm a moderator (I don't think I am, but might be able to add
myself as
 > one).
 >
 > On 4/4/19 7:15 PM, William Shen wrote:
 >> I kept getting this every time I send to the users list. Can
we force
 >> remove the subscriber (martin.pernollet-...@sgcib.com)? Or is this only
 >> happening to me?
 >>
 >> Thanks
 >>
 >> - Will
 >>
 >> -- Forwarded message -
 >> From: <postmas...@sgcib.com>
 >> Date: Thu, Apr 4, 2019 at 12:17 PM
 >> Subject: DELIVERY FAILURE: Error transferring to
 >> QCMBSJ601.HERMES.SI.SOCGEN; Maximum hop count exceeded. Message
 >> probably in a routing loop.
 >> To: <wills...@marinsoftware.com>
 >>
 >>
 >> Your message
 >>
 >>    Subject: Using Hint with PhoenixRDD
 >>
 >> was not delivered to:
 >>
 >> martin.pernollet-...@sgcib.com
<mailto:martin.pernollet-...@sgcib.com>
<mailto:martin.pernollet-...@sgcib.com
<mailto:martin.pernollet-...@sgcib.com>>
 >>
 >> because:
 >>
 >>    Error transferring to QCMBSJ601.HERMES.SI.SOCGEN; Maximum
hop count
 >> exceeded.  Message probably in a routing loop.
 >>
 >>
 >>
 >>
 >> -- Forwarded message --
 >> From: wills...@marinsoftware.com
<mailto:wills...@marinsoftware.com>
<mailto:wills...@marinsoftware.com
<mailto:wills...@marinsoftware.com>>
 >> To: user@phoenix.apache.org <mailto:user@phoenix.apache.org>
<mailto:user@phoenix.apache.org <mailto:user@phoenix.apache.org>>
 >> Cc:
 >> Bcc:
 >> Date: Thu, 4 Apr 2019 12:16:48 -0700
 >> Subject: Using Hint with PhoenixRDD
 >> Hi all,
 >>
 >> Do we have any way of passing in hints when querying Phoenix
using
 >> PhoenixRDD in Spark? I reviewed the implementation of
PhoenixRDD
 >> and PhoenixRecordWritable, but was not able to find an
obvious way to
 >> do so. Is it supported?
 >>
 >> Thanks in advance!
 >>
 >> - Will
 >>
 >>


Re: PQS + Kerberos problems

2019-05-28 Thread Josh Elser

Make sure you have authorization set up correctly between PQS and HBase.

Specifically, you must have the appropriate Hadoop proxyuser rules set 
up in core-site.xml so that HBase will allow PQS to impersonate the PQS 
end-user.
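
A sketch for sanity-checking those rules. The property names follow the standard Hadoop proxyuser pattern; "phoenix" is an assumed short name for the PQS service principal, and the entries must be present in the core-site.xml that the HBase servers actually load:

import org.apache.hadoop.conf.Configuration;

public class CheckProxyUserRules {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.addResource("core-site.xml");  // same file HBase is configured with
        String pqsUser = "phoenix";         // assumption: substitute your PQS service user
        String hosts = conf.get("hadoop.proxyuser." + pqsUser + ".hosts");
        String groups = conf.get("hadoop.proxyuser." + pqsUser + ".groups");
        System.out.println("hosts=" + hosts + ", groups=" + groups);
        if (hosts == null || groups == null) {
            System.out.println("Missing proxyuser rules: HBase will refuse to let PQS impersonate end users.");
        }
    }
}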


On 5/14/19 11:04 AM, Aleksandr Saraseka wrote:

Hello, I have HBase + PQS 4.14.1
If I try to connect with the thick client, everything works, but if I'm 
using the thin client I can see continuous INFO messages in the PQS logs:
2019-05-14 13:53:58,701 INFO 
org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, 
tries=10, retries=35, started=48292 ms ago, cancelled=false, msg=

...
2019-05-14 14:18:41,446 INFO 
org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, 
tries=33, retries=35, started=510325 ms ago, cancelled=false, msg=
2019-05-14 14:19:01,489 INFO 
org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, 
tries=34, retries=35, started=530368 ms ago, cancelled=false, msg=

...
2019-05-14 14:18:41,446 INFO 
org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, 
tries=33, retries=35, started=510325 ms ago, cancelled=false, msg=
2019-05-14 14:19:01,489 INFO 
org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, 
tries=34, retries=35, started=530368 ms ago, cancelled=false, msg=
2019-05-14 14:19:50,139 INFO 
org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, 
tries=10, retries=35, started=48480 ms ago, cancelled=false, msg=row 
'SYSTEM:CATALOG,,' on table 'hbase:meta' at 
region=hbase:meta,,1.1588230740, hostname=datanode-001.fqdn.com 
,60020,1557323271824, seqNum=0
2019-05-14 14:20:10,333 INFO 
org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, 
tries=11, retries=35, started=68676 ms ago, cancelled=false, msg=row 
'SYSTEM:CATALOG,,' on table 'hbase:meta' at 
region=hbase:meta,,1.1588230740, hostname=datanode-001.fqdn.com 
,60020,1557323271824, seqNum=0


*Hbase security logs:*
2019-05-14 14:42:19,524 INFO 
SecurityLogger.org.apache.hadoop.hbase.Server: Auth successful for 
HTTP/phoenix-queryserver-fqdn@realm.com 
 (auth:KERBEROS)
2019-05-14 14:42:19,524 INFO 
SecurityLogger.org.apache.hadoop.hbase.Server: Connection from 
10.252.16.253 port: 41040 with version info: version: "1.2.0-cdh5.14.2" 
url: 
"file:///data/jenkins/workspace/generic-binary-tarball-and-maven-deploy/CDH5.14.2-Packaging-HBase-2018-03-27_13-15-05/hbase-1.2.0-cdh5.14.2" 
revision: "Unknown" user: "jenkins" date: "Tue Mar 27 13:31:54 PDT 2018" 
src_checksum: "05e6e90e06dd7796f56067208a9bf2aa"
2019-05-14 14:42:29,634 INFO 
SecurityLogger.org.apache.hadoop.hbase.Server: Auth successful for 
HTTP/phoenix-queryserver-fqdn@realm.com 
 (auth:KERBEROS)
2019-05-14 14:42:29,635 INFO 
SecurityLogger.org.apache.hadoop.hbase.Server: Connection from 
10.252.16.253 port: 41046 with version info: version: "1.2.0-cdh5.14.2" 
url: 
"file:///data/jenkins/workspace/generic-binary-tarball-and-maven-deploy/CDH5.14.2-Packaging-HBase-2018-03-27_13-15-05/hbase-1.2.0-cdh5.14.2" 
revision: "Unknown" user: "jenkins" date: "Tue Mar 27 13:31:54 PDT 2018" 
src_checksum: "05e6e90e06dd7796f56067208a9bf2aa"



*thin client logs:*
19/05/14 14:10:08 DEBUG execchain.MainClientExec: Proxy auth state: 
UNCHALLENGED

19/05/14 14:10:08 DEBUG http.headers: http-outgoing-0 >> POST / HTTP/1.1
19/05/14 14:10:08 DEBUG http.headers: http-outgoing-0 >> Content-Length: 137
19/05/14 14:10:08 DEBUG http.headers: http-outgoing-0 >> Content-Type: 
application/octet-stream
19/05/14 14:10:08 DEBUG http.headers: http-outgoing-0 >> Host: 
host-fqdn.com:8765 
19/05/14 14:10:08 DEBUG http.headers: http-outgoing-0 >> Connection: 
Keep-Alive
19/05/14 14:10:08 DEBUG http.headers: http-outgoing-0 >> User-Agent: 
Apache-HttpClient/4.5.2 (Java/1.8.0_161)
19/05/14 14:10:08 DEBUG http.headers: http-outgoing-0 >> 
Accept-Encoding: gzip,deflate
19/05/14 14:10:08 DEBUG http.headers: http-outgoing-0 >> Authorization: 
Negotiate 

NoSQL Day on May 21st in Washington D.C.

2019-05-09 Thread Josh Elser
For those of you in/around the Washington D.C. area, NoSQL day is fast 
approaching.


If you've not already signed up, please check out the agenda and 
consider joining us for a fun and technical day with lots of talks from 
Apache committers and big names in industry:


https://dataworkssummit.com/nosql-day-2019/

For those still on the fence, please use the code NSD50 to get 50% off the 
registration fee.


Thanks and see you there!

- Josh


Re: Phoenix Mapreduce

2019-04-30 Thread Josh Elser
No, you will not "lose" data. You will just have mappers that read from 
more than one Region (and thus, more than one RegionServer). The hope in 
this approach is that we can launch Mappers on the same node as the 
RegionServer hosting your Region and avoid reading any data over the 
network.


This is just an optimization.

On 4/30/19 10:12 AM, Shawn Li wrote:

Hi,

The number of maps in a Phoenix MapReduce job is determined by the table's 
region count. My question is: if a region is split by another injection 
process while the Phoenix MapReduce job is running, do we lose some of the 
data we are reading due to this split? We would then have more regions than 
maps, and the maps only have the region information from before the split.


Thanks,
Shawn


Re: Large differences in query execution time for similar queries

2019-04-22 Thread Josh Elser
Further, I'd try to implement James' suggestions _not_ using the Phoenix 
Query Server. Remember that the thin-client uses PQS, adding a level of 
indirection and re-serialization.


By using the "thick" driver, you can avoid this overhead which will help 
you get repeatable test results with less moving pieces.
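
For example, a bare-bones thick-driver test that times the same query without PQS in the path. The ZooKeeper quorum is a placeholder, and the view and columns come from the schema below:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ThickDriverTiming {
    public static void main(String[] args) throws Exception {
        String sql = "SELECT \"pk\", \"doubleCol\", \"intCol3\" FROM \"extended\" "
                   + "WHERE \"doubleCol\" > 100 LIMIT 10000";
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk1,zk2,zk3:2181:/hbase");
             Statement stmt = conn.createStatement()) {
            long start = System.currentTimeMillis();
            int rows = 0;
            try (ResultSet rs = stmt.executeQuery(sql)) {
                while (rs.next()) {
                    rows++;  // drain the result set so the full query cost is measured
                }
            }
            System.out.println(rows + " rows in " + (System.currentTimeMillis() - start) + " ms");
        }
    }
}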


On 4/17/19 11:30 AM, James Taylor wrote:

Hi Hieu,
You could try adding the /*+ SERIAL */ hint to see if that has any impact. 
Also, have you tried not salting the table? The SALT_BUCKETS value of 
128 is pretty high.


For the other issue, do you have a lot of deleted cells? You might try 
running a major compaction. You might try adding a secondary index 
on "doubleCol" if that's a common query.


Thanks,
James
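
A hedged sketch of trying both of those suggestions (the index name is made up; the view and column names come from the schema in the question below):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class TrySerialHintAndIndex {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181:/hbase");
             Statement stmt = conn.createStatement()) {
            // 1) Force the scan to run serially instead of as parallel chunks.
            try (ResultSet rs = stmt.executeQuery(
                    "SELECT /*+ SERIAL */ \"pk\", \"doubleCol\" FROM \"extended\" "
                    + "WHERE \"doubleCol\" > 100 LIMIT 10000")) {
                while (rs.next()) { /* drain */ }
            }
            // 2) A secondary index so the filter on "doubleCol" no longer needs a full scan.
            stmt.execute("CREATE INDEX IF NOT EXISTS EXT_DOUBLECOL_IDX ON \"extended\" (\"doubleCol\")");
        }
    }
}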

On Thu, Apr 11, 2019 at 5:44 PM Hieu Nguyen wrote:


Hi,

I am using Phoenix 4.14-cdh5.11, with sqlline-thin as the client.  I
am seeing strange patterns around SELECT query execution time:
1. Increasing the LIMIT past a certain "threshold" results in
significantly slower execution time.
2. Adding just one column (BIGINT) to the SELECT results in
significantly slower execution time.

This is our schema (names are changed for readability):
CREATE TABLE "metadata" (
   "pk"                           VARCHAR PRIMARY KEY
)
SALT_BUCKETS = 128,
COLUMN_ENCODED_BYTES = 0,
BLOOMFILTER = 'ROWCOL',
COMPRESSION = 'GZ';

CREATE VIEW "extended" (
"doubleCol" DOUBLE,
"intCol" BIGINT,
"intCol2" BIGINT,
"intCol3" BIGINT,
"stringCol" VARCHAR,
"stringCol2" VARCHAR,
"stringCol3" VARCHAR,
"stringCol4" VARCHAR,
"stringCol5" VARCHAR,
"stringCol6" VARCHAR,
"stringCol7" VARCHAR,
) AS SELECT * FROM "metadata"

We have other views created that also select from "metadata" that
define their own columns.  Overall, there are 1 million rows in this
table, and 20k rows match the condition "doubleCol" > 100.

Base query:
SELECT
"pk","doubleCol","intCol","intCol2","stringCol","stringCol2","intCol3"
FROM "templatealldatattype-7d55c5a6-efe3-419d-9bce-9fea7c14f8bc"
WHERE "doubleCol" > 100
LIMIT 1
-> 1.976 seconds

Decreasing LIMIT to 9500 (only 5% decrease in number of rows):
SELECT
"pk","doubleCol","intCol","intCol2","stringCol","stringCol2","intCol3"
FROM "templatealldatattype-7d55c5a6-efe3-419d-9bce-9fea7c14f8bc"
WHERE "doubleCol" > 100
LIMIT 9500
-> 0.409 seconds

Removing "intCol3" from SELECT, keeping LIMIT at 1:
SELECT "pk","doubleCol","intCol","intCol2","stringCol","stringCol2"
FROM "templatealldatattype-7d55c5a6-efe3-419d-9bce-9fea7c14f8bc"
WHERE "doubleCol" > 100
LIMIT 1
-> 0.339 seconds

I ran each of these queries a few times in a row.  There was small
variation in execution time, but the 2nd and 3rd queries never were
slower than the 1st query.

The EXPLAIN plan did not change, except the ROW LIMIT value when
explaining the 2nd query (9500 instead of 1).

+-----------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
|                                                PLAN                                                 | EST_BYTES_READ  | EST_ROWS_READ  |  EST_INFO_TS   |
+-----------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
| CLIENT 128-CHUNK 382226 ROWS 314572800 BYTES PARALLEL 128-WAY ROUND ROBIN FULL SCAN OVER metadata   | 314572800       | 382226         | 1554973434637  |
|     SERVER FILTER BY "doubleCol" > 100.0                                                            | 314572800       | 382226         | 1554973434637  |
|     SERVER 1 ROW LIMIT                                                                              | 314572800       | 382226         | 1554973434637  |
| CLIENT 1 ROW LIMIT                                                                                  | 314572800       | 382226         | 1554973434637  |
+-----------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+

I tried adding the SEEK_TO_COLUMN and NO_SEEK_TO_COLUMN hints as
suggested in a similar thread

(https://lists.apache.org/thread.html/4ef8384ecd31f30fdaf5837e3abc613142426d899e916c7aae4a46d4@%3Cuser.phoenix.apache.org%3E),
but they had no effect.

Any pointers to how we can investigate the 4-5x slowdown when
increasing LIMIT by only ~5% or when selecting just one more BIGINT
column?  Could we have exceeded some threshold in the result size
that caused the query to perform a lot slower for seemingly small
changes in the query?

Thanks,
-Hieu



Re: Fwd: DELIVERY FAILURE: Error transferring to QCMBSJ601.HERMES.SI.SOCGEN; Maximum hop count exceeded. Message probably in a routing loop.

2019-04-05 Thread Josh Elser
I can't do this right now, but I've asked Infra to give me the karma I 
need to remove this person.


If a moderator of user@phoenix is watching this, they can use the tool 
on Whimsy[1] to remove this subscriber.


- Josh

[1] https://whimsy.apache.org/committers/moderationhelper.cgi

On 4/5/19 3:47 PM, Josh Elser wrote:

Not just you. I have an email filter set up for folks like this :)

A moderator should be able to force a removal. Trying it now to see if 
I'm a moderator (I don't think I am, but might be able to add myself as 
one).


On 4/4/19 7:15 PM, William Shen wrote:
I kept getting this every time I send to the users list. Can we force 
remove the subscriber (martin.pernollet-...@sgcib.com)? Or is this only happening 
to me?


Thanks

- Will

-- Forwarded message -
From: mailto:postmas...@sgcib.com>>
Date: Thu, Apr 4, 2019 at 12:17 PM
Subject: DELIVERY FAILURE: Error transferring to 
QCMBSJ601.HERMES.SI.SOCGEN; Maximum hop count exceeded. Message 
probably in a routing loop.

To: mailto:wills...@marinsoftware.com>>


Your message

   Subject: Using Hint with PhoenixRDD

was not delivered to:

martin.pernollet-...@sgcib.com <mailto:martin.pernollet-...@sgcib.com>

because:

   Error transferring to QCMBSJ601.HERMES.SI.SOCGEN; Maximum hop count 
exceeded.  Message probably in a routing loop.





-- Forwarded message --
From: wills...@marinsoftware.com <mailto:wills...@marinsoftware.com>
To: user@phoenix.apache.org <mailto:user@phoenix.apache.org>
Cc:
Bcc:
Date: Thu, 4 Apr 2019 12:16:48 -0700
Subject: Using Hint with PhoenixRDD
Hi all,

Do we have any way of passing in hints when querying Phoenix using 
PhoenixRDD in Spark? I reviewed the implementation of PhoenixRDD 
and PhoenixRecordWritable, but was not able to find an obvious way to 
do so. Is it supported?


Thanks in advance!

- Will




Re: Fwd: DELIVERY FAILURE: Error transferring to QCMBSJ601.HERMES.SI.SOCGEN; Maximum hop count exceeded. Message probably in a routing loop.

2019-04-05 Thread Josh Elser

Not just you. I have an email filter set up for folks like this :)

A moderator should be able to force a removal. Trying it now to see if 
I'm a moderator (I don't think I am, but might be able to add myself as 
one).


On 4/4/19 7:15 PM, William Shen wrote:
I kept getting this every time I send to the users list. Can we force 
remove the subscriber (martin.pernollet-...@sgcib.com)? Or is this only happening to me?


Thanks

- Will

-- Forwarded message -
From: <postmas...@sgcib.com>
Date: Thu, Apr 4, 2019 at 12:17 PM
Subject: DELIVERY FAILURE: Error transferring to 
QCMBSJ601.HERMES.SI.SOCGEN; Maximum hop count exceeded. Message probably 
in a routing loop.

To: <wills...@marinsoftware.com>


Your message

   Subject: Using Hint with PhoenixRDD

was not delivered to:

martin.pernollet-...@sgcib.com 

because:

   Error transferring to QCMBSJ601.HERMES.SI.SOCGEN; Maximum hop count 
exceeded.  Message probably in a routing loop.





-- Forwarded message --
From: wills...@marinsoftware.com 
To: user@phoenix.apache.org 
Cc:
Bcc:
Date: Thu, 4 Apr 2019 12:16:48 -0700
Subject: Using Hint with PhoenixRDD
Hi all,

Do we have any way of passing in hints when querying Phoenix using 
PhoenixRDD in Spark? I reviewed the implementation of PhoenixRDD 
and PhoenixRecordWritable, but was not able to find an obvious way to do 
so. Is it supported?


Thanks in advance!

- Will




2 weeks remaining for NoSQL Day abstract submission

2019-04-04 Thread Josh Elser
There are just *two weeks* remaining to submit abstracts for NoSQL Day 
2019, in Washington D.C. on May 21st. Abstracts are due April 19th.


https://dataworkssummit.com/nosql-day-2019/

Abstracts don't need to be more than a paragraph or two. Please take the time 
sooner rather than later to submit your abstract. Of course, those talks which 
are selected will receive a complimentary pass to attend the event.


Please reply to a single user list or to me directly with any questions. 
Thanks!


- Josh


Re: split count for mapreduce jobs with PhoenixInputFormat

2019-01-30 Thread Josh Elser
Please do not take this advice lightly. Adding (or increasing) salt 
buckets can have a serious impact on the execution of your queries.


On 1/30/19 5:33 PM, venkata subbarayudu wrote:
You may recreate the table with the SALT_BUCKETS table option to get a 
reasonable number of regions, and you may try adding a secondary index to make 
the query run faster in case your MapReduce job performs specific filters.


On Thu 31 Jan, 2019, 12:09 AM Thomas D'Silva <tdsi...@salesforce.com> wrote:


If stats are enabled PhoenixInputFormat will generate a split per
guidepost.
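
A hedged sketch of leaning on that behaviour: shrink the guidepost width for the table and regenerate stats, which yields more guideposts (and therefore more input splits) without changing the region count. The width value and table name are examples only:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class MoreGuidepostsMoreSplits {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181:/hbase");
             Statement stmt = conn.createStatement()) {
            // Smaller guidepost width => more guideposts per region => more splits/mappers.
            stmt.execute("ALTER TABLE MY_TABLE SET GUIDE_POSTS_WIDTH = 10000000");
            stmt.execute("UPDATE STATISTICS MY_TABLE");
        }
    }
}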

On Wed, Jan 30, 2019 at 7:31 AM Josh Elser <els...@apache.org> wrote:

You can extend/customize the PhoenixInputFormat with your own
code to
increase the number of InputSplits and Mappers.

On 1/30/19 6:43 AM, Edwin Litterst wrote:
 > Hi,
 > I am using PhoenixInputFormat as input source for mapreduce jobs.
 > The split count (which determines how many mappers are used
for the job)
 > is always equal to the number of regions of the table from
where I
 > select the input.
 > Is there a way to increase the number of splits? My job is
running too
 > slow with only one mapper for every region.
 > (Increasing the number of regions is no option.)
 > regards,
 > Eddie



Re: split count for mapreduce jobs with PhoenixInputFormat

2019-01-30 Thread Josh Elser
You can extend/customize the PhoenixInputFormat with your own code to 
increase the number of InputSplits and Mappers.


On 1/30/19 6:43 AM, Edwin Litterst wrote:

Hi,
I am using PhoenixInputFormat as input source for mapreduce jobs.
The split count (which determines how many mappers are used for the job) 
is always equal to the number of regions of the table from where I 
select the input.
Is there a way to increase the number of splits? My job is running too 
slow with only one mapper for every region.

(Increasing the number of regions is no option.)
regards,
Eddie


Re: Is it possible to do a dynamic deploy of a newer version of Phoenix coprocessor to specific tables?

2019-01-22 Thread Josh Elser
I was referring to the JDBC (thick) client and the coprocessors inside 
HBase. The thin JDBC client does not talk to HBase directly, only to PQS.


PQS and the thin-client would actually be the exception to what I said 
in my last message. You could (with a high degree of confidence) deploy 
a new version of Phoenix using an old thin JDBC driver. However, this 
isn't really any different than just upgrading Phoenix wholesale.


Yup, in short, you're glossing over quite a bit. One example: the 
(thick) JDBC client must construct and send RPC messages to the appropriate 
RegionServers to execute certain operations. The deployed coprocessors 
in HBase must both parse those RPC messages and 
interpret them correctly (e.g. an older CP might be able to parse a 
newer client's message, but could miss an important field that was added 
to that message).


On 1/22/19 4:02 AM, Owen Rees-Hayward wrote:

Hey Josh, thanks for your thoughts.

Based on your advice we will almost certainly not pursue this direction. 
But just to clarify, in terms of the client version are you referring to 
the Query server, JDBC clients or both?


I imagine from the JDBC perspective that a client would only be 
accessing tables with the same Phoenix version. But it may be that my 
take has a lot of erroneous assumptions in it, as I haven't looked at 
the internals of the JDBC driver code.


On Mon, 21 Jan 2019 at 18:09, Josh Elser <els...@apache.org> wrote:


Owen,

There would be significant "unwanted side-effects". You would be
taking on a very large burden trying to come up with a corresponding
client version of Phoenix which would still work against the newer
coprocessors that you are trying to deploy. Phoenix doesn't provide
any guarantee of compatibility for more than a few versions between
client and server.

Would suggest that you move to HDP 3.1.0 if you want a newer version
of Phoenix.

On Mon, Jan 21, 2019 at 9:06 AM Owen Rees-Hayward
mailto:owe...@googlemail.com>> wrote:
 >
 > We are on HDP 2.6.5 but would like to use a more recent version
of Phoenix, without upgrading it cluster-wide.
 >
 > HBase coprocessors can be dynamically deployed (for instance
picking up the coprocessor jar from HDFS) against specific tables.
We are wondering whether this would be a route to using a newer
version of Phoenix against a set of tables? We are unclear if there
would be unwanted side-effects.
 >
 > I'd be really interested to know if anyone has attempted this
with success or otherwise.
 >
 > Thanks in advance.
 >
 >
 > --
 > Owen Rees-Hayward
 > 07912 876046
 > twitter.com/owen4d <http://twitter.com/owen4d>



--
Owen Rees-Hayward
07912 876046
twitter.com/owen4d <http://twitter.com/owen4d>


Re: Is it possible to do a dynamic deploy of a newer version of Phoenix coprocessor to specific tables?

2019-01-21 Thread Josh Elser
Owen,

There would be significant "unwanted side-effects". You would be
taking on a very large burden trying to come up with a corresponding
client version of Phoenix which would still work against the newer
coprocessors that you are trying to deploy. Phoenix doesn't provide
any guarantee of compatibility for more than a few versions between
client and server.

Would suggest that you move to HDP 3.1.0 if you want a newer version of Phoenix.

On Mon, Jan 21, 2019 at 9:06 AM Owen Rees-Hayward  wrote:
>
> We are on HDP 2.6.5 but would like to use a more recent version of Phoenix, 
> without upgrading it cluster-wide.
>
> HBase coprocessors can be dynamically deployed (for instance picking up the 
> coprocessor jar from HDFS) against specific tables. We are wondering whether 
> this would be a route to using a newer version of Phoenix against a set of 
> tables? We are unclear if there would be unwanted side-effects.
>
> I'd be really interested to know if anyone has attempted this with success or 
> otherwise.
>
> Thanks in advance.
>
>
> --
> Owen Rees-Hayward
> 07912 876046
> twitter.com/owen4d


Re: HBase vs Phoenix column names

2019-01-08 Thread Josh Elser

(from the peanut-gallery)

That sounds to me like a useful utility to share with others if you're 
going to write it anyways, Anil :)
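
For reference, a rough, untested sketch of the lookup Thomas describes below; 
the connection URL, table name, and the fallback column family of "0" are 
assumptions to check against your own schema.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class EncodedQualifierLookup {
    public static Scan scanFor(String table, String column) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181:/hbase");
             PreparedStatement ps = conn.prepareStatement(
                 "SELECT COLUMN_FAMILY, COLUMN_QUALIFIER FROM SYSTEM.CATALOG "
                 + "WHERE TABLE_NAME = ? AND COLUMN_NAME = ?")) {
            ps.setString(1, table);
            ps.setString(2, column);
            try (ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) {
                    throw new IllegalArgumentException("No such column: " + table + "." + column);
                }
                // "0" is Phoenix's default column family when none is declared in the DDL.
                String family = rs.getString(1) != null ? rs.getString(1) : "0";
                byte[] qualifier = rs.getBytes(2); // the encoded qualifier, e.g. \x80\x0B
                Scan scan = new Scan();
                scan.addColumn(Bytes.toBytes(family), qualifier);
                return scan; // hand this Scan to your existing HBase MapReduce job
            }
        }
    }
}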


On 1/8/19 12:54 AM, Thomas D'Silva wrote:
There isn't an existing utility that does that. You would have to look 
up the COLUMN_QUALIFIER for the columns you are interested in from 
SYSTEM.CATALOG and then use them to create a Scan.

On Mon, Jan 7, 2019 at 9:22 PM Anil > wrote:


Hi Team,

Is there any utility to read HBase data, using the HBase APIs, from a
table that was created with Phoenix column name encoding?

The idea is to use all the performance and disk usage improvements
achieved with the Phoenix column name encoding feature, and keep using
our existing HBase jobs for our data analysis.

Thanks,
Anil

On Tue, 11 Dec 2018 at 14:02, Anil mailto:anilk...@gmail.com>> wrote:

Thanks.

On Tue, 11 Dec 2018 at 11:51, Jaanai Zhang
mailto:cloud.pos...@gmail.com>> wrote:

The difference is due to encoded column names, which are supported
since version 4.10 (also see PHOENIX-1598).
You can set the COLUMN_ENCODED_BYTES property to keep the
original column names in the CREATE TABLE SQL, for example:

create table test(
    id varchar primary key,
    col varchar
) COLUMN_ENCODED_BYTES = 0;




    Jaanai Zhang
    Best regards!



Anil mailto:anilk...@gmail.com>> 于2018
年12月11日周二 下午1:24写道:

HI,

We have upgraded phoenix to Phoenix-4.11.0-cdh5.11.2
from phoenix 4.7.

Problem - When a table is created in Phoenix, the underlying
HBase column names and the Phoenix column names are
different. Tables created with version 4.7 look fine.

CREATE TABLE TST_TEMP (TID VARCHAR PRIMARY KEY ,PRI
VARCHAR,SFLG VARCHAR,PFLG VARCHAR,SOLTO VARCHAR,BILTO
VARCHAR) COMPRESSION = 'SNAPPY';

0: jdbc:phoenix:dq-13.labs.> select TID,PRI,SFLG from
TST_TEMP limit 2;
+-------------+------------+-----------+
|     TID     |    PRI     |   SFLG    |
+-------------+------------+-----------+
| 0060189122  | 0.00       |           |
| 0060298478  | 13390.26   |           |
+-------------+------------+-----------+


hbase(main):011:0> scan 'TST_TEMP', {LIMIT => 2}
ROW                                      COLUMN+CELL
  0060189122 
column=0:\x00\x00\x00\x00, timestamp=1544296959236, value=x
  0060189122 
column=0:\x80\x0B, timestamp=1544296959236, value=0.00
  0060298478 
column=0:\x00\x00\x00\x00, timestamp=1544296959236, value=x
  0060298478 
column=0:\x80\x0B, timestamp=1544296959236, value=13390.26



The HBase column names are completely different from the
Phoenix column names. This change is observed only
post-upgrade. All existing tables created in earlier
versions look fine, and ALTER statements to existing
tables also look fine.

Is there any workaround to avoid this difference? We
could not run HBase MapReduce jobs on HBase tables
created by Phoenix. Thanks.

Thanks








Re: slf4j class files in phoenix-5.0.0-HBase-2.0-client.jar

2018-12-28 Thread Josh Elser
If memory serves me correctly, no, you can't shade+relocate logging
classes. The way that log4j/slf4j "find" the logging classes breaks
down when you do this.

However, if I'm wrong as I very well could be, shading could be an
effective solution to this.

Alternatively, we could generate an artifact which does not provide
logging classes; however, I think we'd need to consider the burden in
creating, maintaining, and distributing an extra jar just to remove an
slf4j warning (unless Liang's problem extends beyond that -- not sure)

On Sat, Dec 22, 2018 at 10:39 PM Miles Spielberg  wrote:
>
> Could the classes be shaded into a different package to prevent conflicts 
> with libraries included by applications? HBase client has been doing this for 
> a while: 
> https://www.i-programmer.info/news/197-data-mining/11427-hbase-14-with-new-shaded-client.html
>
> Sent from my iPhone
>
> On Dec 22, 2018, at 9:25 AM, Josh Elser  wrote:
>
> This is as expected. JDBC expects that a database driver provide all of its 
> dependencies in a single jar file.
>
> On Mon, Dec 17, 2018 at 4:39 PM Liang Zhao  wrote:
>>
>> Hi,
>>
>>
>>
>> We found slf4j class files in your phoenix-5.0.0-HBase-2.0-client.jar, which 
>> caused multiple binding of slf4j as the Spring Boot in our project uses a 
>> different version. We can work around this, but conventionally, a jar file 
>> only needs to name its transitive dependencies, instead of including them.
>>
>>
>>
>> The same problem is seen in phoenix-4.14.0-HBase-1.4-client.jar.
>>
>>
>>
>> Thanks
>>
>>
>>
>> Liang
>>
>>
>>
>> 
>>
>>


Re: slf4j class files in phoenix-5.0.0-HBase-2.0-client.jar

2018-12-22 Thread Josh Elser
This is as expected. JDBC expects that a database driver provide all of its
dependencies in a single jar file.

On Mon, Dec 17, 2018 at 4:39 PM Liang Zhao  wrote:

> Hi,
>
>
>
> We found slf4j class files in your phoenix-5.0.0-HBase-2.0-client.jar,
> which caused multiple binding of slf4j as the Spring Boot in our project
> uses a different version. We can work around this, but conventionally, a
> jar file only needs to name its transitive dependencies, instead of
> including them.
>
>
>
> The same problem is seen in phoenix-4.14.0-HBase-1.4-client.jar.
>
>
>
> Thanks
>
>
>
> Liang
>
>
>
>
>


Re: client does not have phoenix.schema.isNamespaceMappingEnabled

2018-11-29 Thread Josh Elser

Why didn't it work?

The hbase-protocol.jar is insufficient to run MapReduce jobs against 
HBase; full stop. You're going to get lots of stuff pulled in via the 
phoenix-client.jar that you give to `hadoop jar`. That said, I can't 
think of a reason that including more jars on the classpath would be 
harmful.


Realistically, you might only need to provide HBASE_CONF_DIR to the 
HADOOP_CLASSPATH env variable, so that your mappers and reducers also 
get it on their classpath. The rest of the Java classes would be 
automatically localized via `hadoop jar`.


On 11/29/18 1:27 PM, M. Aaron Bossert wrote:
So, sorry for the super late reply...there is weird lag between the time 
a message is sent or received to this mailing list and when I actually 
see it...But, I have got it working now as follows:


HADOOP_CLASSPATH=/usr/hdp/3.0.1.0-187/hbase/lib/hbase-protocol.jar:/etc/hbase/3.0.1.0-187/0/ hadoop jar ...


using this did not work:

HADOOP_CLASSPATH="$(hbase mapredcp)" hadoop jar ...


the output of that command separately is this:

[user@server /somedir $] [mabossert@edge-3 lanl_data]$ hbase mapredcp

/usr/hdp/3.0.1.0-187/hbase/lib/hbase-shaded-protobuf-2.1.0.jar:/usr/hdp/3.0.1.0-187/zookeeper/zookeeper-3.4.6.3.0.1.0-187.jar:/usr/hdp/3.0.1.0-187/hbase/lib/htrace-core4-4.2.0-incubating.jar:/usr/hdp/3.0.1.0-187/hbase/lib/commons-lang3-3.6.jar:/usr/hdp/3.0.1.0-187/hbase/lib/hbase-server-2.0.0.3.0.1.0-187.jar:/usr/hdp/3.0.1.0-187/hbase/lib/hbase-protocol-shaded-2.0.0.3.0.1.0-187.jar:/usr/hdp/3.0.1.0-187/hbase/lib/hbase-hadoop2-compat-2.0.0.3.0.1.0-187.jar:/usr/hdp/3.0.1.0-187/hbase/lib/hbase-mapreduce-2.0.0.3.0.1.0-187.jar:/usr/hdp/3.0.1.0-187/hbase/lib/hbase-metrics-api-2.0.0.3.0.1.0-187.jar:/usr/hdp/3.0.1.0-187/hbase/lib/protobuf-java-2.5.0.jar:/usr/hdp/3.0.1.0-187/hbase/lib/metrics-core-3.2.1.jar:/usr/hdp/3.0.1.0-187/hbase/lib/jackson-databind-2.9.5.jar:/usr/hdp/3.0.1.0-187/hbase/lib/hbase-client-2.0.0.3.0.1.0-187.jar:/usr/hdp/3.0.1.0-187/hbase/lib/hbase-hadoop-compat-2.0.0.3.0.1.0-187.jar:/usr/hdp/3.0.1.0-187/hbase/lib/hbase-protocol-2.0.0.3.0.1.0-187.jar:/usr/hdp/3.0.1.0-187/hbase/lib/hbase-shaded-netty-2.1.0.jar:/usr/hdp/3.0.1.0-187/hbase/lib/hbase-shaded-miscellaneous-2.1.0.jar:/usr/hdp/3.0.1.0-187/hbase/lib/hbase-metrics-2.0.0.3.0.1.0-187.jar:/usr/hdp/3.0.1.0-187/hbase/lib/hbase-common-2.0.0.3.0.1.0-187.jar:/usr/hdp/3.0.1.0-187/hbase/lib/hbase-zookeeper-2.0.0.3.0.1.0-187.jar:/usr/hdp/3.0.1.0-187/hbase/lib/jackson-annotations-2.9.5.jar:/usr/hdp/3.0.1.0-187/hbase/lib/jackson-core-2.9.5.jar


On Tue, Nov 27, 2018 at 4:26 PM Josh Elser <mailto:els...@apache.org>> wrote:


To add a non-jar file to the classpath of a Java application, you must
add the directory containing that file to the classpath.

Thus, the following is wrong:

HADOOP_CLASSPATH=/usr/hdp/3.0.1.0-187/hbase/lib/hbase-protocol.jar:/etc/hbase/3.0.1.0-187/0/hbase-site.xml

And should be:

HADOOP_CLASSPATH=/usr/hdp/3.0.1.0-187/hbase/lib/hbase-protocol.jar:/etc/hbase/3.0.1.0-187/0/

Most times, including the output of `hbase mapredcp` is sufficient ala

HADOOP_CLASSPATH="$(hbase mapredcp)" hadoop jar ...

On 11/27/18 10:48 AM, M. Aaron Bossert wrote:
 > Folks,
 >
 > I have, I believe, followed all the directions for turning on
namespace
 > mapping as well as extra steps to (added classpath) required to
use the
 > mapreduce bulk load utility, but am still running into this
error...I am
 > running a Hortonworks cluster with both HDP v 3.0.1 and HDF
components.
 > Here is what I have tried:
 >
 >   * Checked that the proper hbase-site.xml (in my case:
 >     /etc/hbase/3.0.1.0-187/0/hbase-site.xml) file is being referenced
 >     when launching the mapreduce utility:
 >
 >
 >      ...
 >
 > <property>
 >   <name>phoenix.schema.isNamespaceMappingEnabled</name>
 >   <value>true</value>
 > </property>
 >
 > <property>
 >   <name>phoenix.schema.mapSystemTablesToNamespace</name>
 >   <value>true</value>
 > </property>
 >
 >      ...
 >
 >   * added the appropriate classpath additions to the hadoop jar
command
 >     (zookeeper quorum hostnames changed to remove my corporate
network
 >     info as well as data directory):
 >
 >

HADOOP_CLASSPATH=/usr/hdp/3.0.1.0-187/hbase/lib/hbase-protocol.jar:/etc/hbase/3.0.1.0-187/0/hbase-site.xml
 > hadoop jar
 > /usr/hdp/3.0.1.0-187/phoenix/phoenix-5.0.0.3.0.1.0-187-client.jar
 > org.apache.phoenix.mapreduce.CsvBulkLoadTool --table MYTABLE --input
 > /ingest/MYCSV -z zk1,zk2,zk3 -g
 >
 >
 > ...
 >
 >
 > 18/11/27 15:31:48 INFO zookeeper.ReadOnlyZKClient: C

Re: client does not have phoenix.schema.isNamespaceMappingEnabled

2018-11-27 Thread Josh Elser
To add a non-jar file to the classpath of a Java application, you must 
add the directory containing that file to the classpath.


Thus, the following is wrong: 
HADOOP_CLASSPATH=/usr/hdp/3.0.1.0-187/hbase/lib/hbase-protocol.jar:/etc/hbase/3.0.1.0-187/0/hbase-site.xml


And should be: 
HADOOP_CLASSPATH=/usr/hdp/3.0.1.0-187/hbase/lib/hbase-protocol.jar:/etc/hbase/3.0.1.0-187/0/


Most times, including the output of `hbase mapredcp` is sufficient ala

HADOOP_CLASSPATH="$(hbase mapredcp)" hadoop jar ...

On 11/27/18 10:48 AM, M. Aaron Bossert wrote:

Folks,

I have, I believe, followed all the directions for turning on namespace 
mapping as well as the extra steps (added classpath) required to use the 
mapreduce bulk load utility, but am still running into this error...I am 
running a Hortonworks cluster with both HDP v 3.0.1 and HDF components.  
Here is what I have tried:


  * Checked that the proper hbase-site.xml (in my case:
/etc/hbase/3.0.1.0-187/0/hbase-site.xml) file is being referenced
when launching the mapreduce utility:


     ...




<property>
  <name>phoenix.schema.isNamespaceMappingEnabled</name>
  <value>true</value>
</property>

<property>
  <name>phoenix.schema.mapSystemTablesToNamespace</name>
  <value>true</value>
</property>




     ...

  * added the appropriate classpath additions to the hadoop jar command
(zookeeper quorum hostnames changed to remove my corporate network
info as well as data directory):

HADOOP_CLASSPATH=/usr/hdp/3.0.1.0-187/hbase/lib/hbase-protocol.jar:/etc/hbase/3.0.1.0-187/0/hbase-site.xml 
hadoop jar 
/usr/hdp/3.0.1.0-187/phoenix/phoenix-5.0.0.3.0.1.0-187-client.jar 
org.apache.phoenix.mapreduce.CsvBulkLoadTool --table MYTABLE --input 
/ingest/MYCSV -z zk1,zk2,zk3 -g



...


18/11/27 15:31:48 INFO zookeeper.ReadOnlyZKClient: Close zookeeper 
connection 0x1d58d65f to master-1.punch.datareservoir.net:2181 
,master-2.punch.datareservoir.net:2181 
,master-3.punch.datareservoir.net:2181 



18/11/27 15:31:48 INFO log.QueryLoggerDisruptor: Shutting down 
QueryLoggerDisruptor..


Exception in thread "main" java.sql.SQLException: ERROR 726 
(43M10):Inconsistent namespace mapping properties. Cannot initiate 
connection as SYSTEM:CATALOG is found but client does not have 
phoenix.schema.isNamespaceMappingEnabled enabled


at 
org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:494)


at 
org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:150)


at 
org.apache.phoenix.query.ConnectionQueryServicesImpl.ensureTableCreated(ConnectionQueryServicesImpl.java:1113)


at 
org.apache.phoenix.query.ConnectionQueryServicesImpl.createTable(ConnectionQueryServicesImpl.java:1501)


at 
org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:2740)


at 
org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:1114)


at 
org.apache.phoenix.compile.CreateTableCompiler$1.execute(CreateTableCompiler.java:192)


at 
org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:408)


at 
org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:391)


at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)

at 
org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:390)


at 
org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:378)


at 
org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:1806)


at 
org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2569)


at 
org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2532)


at 
org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)


at 
org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:2532)


at 
org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:255)


at 
org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)


at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221)

at java.sql.DriverManager.getConnection(DriverManager.java:664)

at java.sql.DriverManager.getConnection(DriverManager.java:208)

at org.apache.phoenix.util.QueryUtil.getConnection(QueryUtil.java:400)

at org.apache.phoenix.util.QueryUtil.getConnection(QueryUtil.java:392)

at 
org.apache.phoenix.mapreduce.AbstractBulkLoadTool.loadData(AbstractBulkLoadTool.java:206)


at 
org.apache.phoenix.mapreduce.AbstractBulkLoadTool.run(AbstractBulkLoadTool.java:180)


at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)

at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)

at 
org.apache.phoenix.mapreduce.CsvBulkLoadTool.main(CsvBulkLoadTool.java:109)


at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at 

Re: JDBC Connection URL to Phoenix on Azure HDInsight

2018-11-27 Thread Josh Elser
Are you trying to use the thick driver (direct HBase connection) or the 
thin driver (via Phoenix Query Server)? You're providing examples of both.


If the thick driver: by default, I believe that HDI configures a root 
znode of "/hbase" not "/hbase-unsecure" which would be a problem. You 
need to pass your ZK quorum which makes your 2nd HDI "example" 
nonsensical. You can read the Phoenix home page for information about 
how to construct a Phoenix JDBC URL for the thick driver at 
https://phoenix.apache.org/#connStr
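
For the thick driver, a minimal sketch of that URL format is below; the 
hostnames and root znode are placeholders, so take the real values of 
hbase.zookeeper.quorum and zookeeper.znode.parent from hbase-site.xml in Ambari.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ThickDriverSmokeTest {
    public static void main(String[] args) throws Exception {
        // jdbc:phoenix:<zookeeper quorum>:<zk port>:<hbase root znode>
        String url = "jdbc:phoenix:zk0-host,zk1-host,zk2-host:2181:/hbase";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM SYSTEM.CATALOG")) {
            rs.next();
            System.out.println("Connected; SYSTEM.CATALOG rows: " + rs.getLong(1));
        }
    }
}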


If you are still having troubles, I'd suggest that you clarify exactly 
what you want to do (rather than everything you tried) and include the 
error you ran into.


On 11/27/18 7:54 AM, Raghavendra Channarayappa wrote:

Dear all,

My current problem is fairly trivial, I guess, but I am currently stuck 
on it.


The following was the JDBC connection string for Phoenix on EMR

jdbc:phoenix:thin:url=http://ec2-12-345-678-90.ap-southeast-1.compute.amazonaws.com:8765;serialization=PROTOBUF;autocommit=true

which worked perfectly fine.

But, for Phoenix on Azure HDInsight(which uses HDP2.6.0), I am unable to 
figure out the equivalent JDBC url


The following options *DID NOT* work:
*Azure HDInsight Ambari Dashboard URL as the server_url:*
jdbc:phoenix:thin:url=http://randomcluterdashboard.azurehdinsight.net:8765;serialization=PROTOBUF;autocommit=true
*Phoenix Query Servers (comma-separated) as the server_url:*
jdbc:phoenix:wn0-apache.x.tx.internal.cloudapp.net,wn1-apache.x.tx.internal.cloudapp.net,wn2-apache.x.tx.internal.cloudapp.net,wn3-apache.x.tx.internal.cloudapp.net:2181:/hbase-unsecure
*Zookeeper Quorum (comma-separated) as the server_url:*
jdbc:phoenix:zk0-apache.x.tx.internal.cloudapp.net,zk1-apache.x.tx.internal.cloudapp.net,zk2-apache.x.tx.internal.cloudapp.net:2181:/hbase-unsecure


Where can I find the JDBC url in Ambari dashboard?

Thanks in advance,

Raghavendra






Re: Heap Size Recommendation

2018-11-27 Thread Josh Elser
HBASE_HEAPSIZE is just an environment variable which sets the JVM heap 
size. Your question doesn't make any sense to me.


On 11/16/18 8:44 AM, Azharuddin Shaikh wrote:

Hi All,

We want to improve the read performance of phoenix query for which we 
are trying to upgrade the HBASE_HEAPSIZE.


Currently we have 32 GB memory available on server where default 8 GB 
memory is allocated to HBASE_HEAPSIZE(Default of JVM Heap Size). We want 
to increase HBASE_HEAPSIZE to 16 GB, so do we need to increase JVM Heap 
size as well or there is no need to increase JVM Heap size.


Let us know what is the recommendation for increasing HBASE_HEAPSIZE 
with respect to JVM Heap size. We tried to refer various document but we 
have still not received the exact answer to this question.


Thank you,

Azhar



Re: Python phoenixdb adapter and JSON serialization on PQS

2018-11-09 Thread Josh Elser
-440e-405b-8185-44d1cc7f266c',
   'statementId': 1,
   'maxRowCount': -2,
   'maxRowsTotal': -2,
   'maxRowsInFirstFrame': -2,
}

“sqlline-thin.py -s JSON” works because the Avatica client driver 
Phoenix/Sqlline uses makes the right HTTP calls, with the correct args.


The response time with using JSON serialization and fetching *all* rows 
(cursor.itersize = -2) was 60s, which is much faster than the 360s that 
it takes with protobuf when passing cursor.itersize=2K (note that 
protobuf did not work with fetching all rows at once, i.e., 
cursor.itersize=-2)


Because of these issues, for now we have switched PQS to using JSON 
serialization and updated our clients to use the same. We’re obviously 
very much interested in understanding how the protobuf path can be made 
faster.


Thanks again for the help!
Manoj

On Tue, Nov 6, 2018 at 9:51 AM Josh Elser <mailto:els...@apache.org>> wrote:




On 11/5/18 10:10 PM, Manoj Ganesan wrote:
 > Thanks for the pointers Josh. I'm working on getting a
representative
 > concise test to demonstrate the issue.
 >
 > Meanwhile, I had one question regarding the following:
 >
 >     You are right that the operations in PQS should be exactly
the same,
 >     regardless of the client you're using -- that is how this
 >     architecture works.
 >
 >
 > IIUC, this means the following 2 methods should yield the same
result:
 >
 >  1. sqlline-thin.py -s JSON 
 >  2. using a python avatica client script making JSON requests

That's correct. Any client which speaks to PQS should see the same
results. There may be bugs in the client implementation, of course,
which make this statement false.

 > I made the following change in hbase-site.xml on the PQS host:
 >
 > <property>
 >   <name>phoenix.queryserver.serialization</name>
 >   <value>JSON</value>
 > </property>
 >
 > I notice that executing "sqlline-thin.py -s JSON "
returns
 > results just fine. However, when I use a simple script to try the
same
 > query, it returns 0 rows. I'm attaching the Python script here. The
 > script essentially makes HTTP calls using the Avatica JSON reference
 > <https://calcite.apache.org/avatica/docs/json_reference.html>. I
assumed
 > that the sqlline-thin wrapper (when passed the -s JSON flag) also
make
 > HTTP calls based on the JSON reference, is that not correct?

Apache mailing lists strip attachments. Please consider hosting it
somewhere else, along with instructions/scripts to generate the
required
tables. Please provide some more analysis of the problem than just a
summarization of what you see as an end-user -- I don't have the cycles
or interest to debug the entire system for you :)

Avatica is a protocol that interprets JDBC using some serialization
(JSON or Protobuf today) and a transport (only HTTP) to a remote server
to run the JDBC oeprations. So, yes: an Avatica client is always using
HTTP, given whatever serialization you instruct it to use.

 > I'll work on getting some test cases here soon to illustrate this as
 > well as the performance problem.
 >
 > Thanks again!
 > Manoj
 >
 > On Mon, Nov 5, 2018 at 10:43 AM Josh Elser mailto:els...@apache.org>
 > <mailto:els...@apache.org <mailto:els...@apache.org>>> wrote:
 >
 >     Is the OOME issue regardless of using the Java client
(sqlline-thin)
 >     and
 >     the Python client? I would like to know more about this one.
If you can
 >     share something that reproduces the problem for you, I'd like
to look
 >     into it. The only suggestion I have at this point in time is
to make
 >     sure you set a reasonable max-heap size in hbase-env.sh (e.g.
-Xmx) via
 >     PHOENIX_QUERYSERVER_OPTS and have HBASE_CONF_DIR pointing to
the right
 >     directory when you launch PQS.
 >
 >     Regarding performance, as you've described it, it sounds like the
 >     Python
 >     driver is just slower than the Java driver. You are right
that the
 >     operations in PQS should be exactly the same, regardless of
the client
 >     you're using -- that is how this architecture works. Avatica
is a wire
 >     protocol that all clients use to talk to PQS. More
digging/information
 >     you can provide about the exact circumstances (and, again,
 >     steps/environment to reproduce what you see) would be
extremely helpful.
 >
 >     Thanks Manoj.
 >
 >     - Josh
 >
 >     On 11/2/18 7:16 PM, Manoj Ganesan wrote:
 >      > Thanks Josh for the response!
 >      >
 >      > I would

Re: Python phoenixdb adapter and JSON serialization on PQS

2018-11-06 Thread Josh Elser



On 11/5/18 10:10 PM, Manoj Ganesan wrote:
Thanks for the pointers Josh. I'm working on getting a representative 
concise test to demonstrate the issue.


Meanwhile, I had one question regarding the following:

You are right that the operations in PQS should be exactly the same,
regardless of the client you're using -- that is how this
architecture works.


IIUC, this means the following 2 methods should yield the same result:

 1. sqlline-thin.py -s JSON 
 2. using a python avatica client script making JSON requests


That's correct. Any client which speaks to PQS should see the same 
results. There may be bugs in the client implementation, of course, 
which make this statement false.



I made the following change in hbase-site.xml on the PQS host:


<property>
  <name>phoenix.queryserver.serialization</name>
  <value>JSON</value>
</property>


I notice that executing "sqlline-thin.py -s JSON " returns 
results just fine. However, when I use a simple script to try the same 
query, it returns 0 rows. I'm attaching the Python script here. The 
script essentially makes HTTP calls using the Avatica JSON reference 
<https://calcite.apache.org/avatica/docs/json_reference.html>. I assumed 
that the sqlline-thin wrapper (when passed the -s JSON flag) also make 
HTTP calls based on the JSON reference, is that not correct?


Apache mailing lists strip attachments. Please consider hosting it 
somewhere else, along with instructions/scripts to generate the required 
tables. Please provide some more analysis of the problem than just a 
summarization of what you see as an end-user -- I don't have the cycles 
or interest to debug the entire system for you :)


Avatica is a protocol that interprets JDBC using some serialization 
(JSON or Protobuf today) and a transport (only HTTP) to a remote server 
to run the JDBC oeprations. So, yes: an Avatica client is always using 
HTTP, given whatever serialization you instruct it to use.


I'll work on getting some test cases here soon to illustrate this as 
well as the performance problem.


Thanks again!
Manoj

On Mon, Nov 5, 2018 at 10:43 AM Josh Elser <mailto:els...@apache.org>> wrote:


Is the OOME issue regardless of using the Java client (sqlline-thin)
and
the Python client? I would like to know more about this one. If you can
share something that reproduces the problem for you, I'd like to look
into it. The only suggestion I have at this point in time is to make
sure you set a reasonable max-heap size in hbase-env.sh (e.g. -Xmx) via
PHOENIX_QUERYSERVER_OPTS and have HBASE_CONF_DIR pointing to the right
directory when you launch PQS.

Regarding performance, as you've described it, it sounds like the
Python
driver is just slower than the Java driver. You are right that the
operations in PQS should be exactly the same, regardless of the client
you're using -- that is how this architecture works. Avatica is a wire
protocol that all clients use to talk to PQS. More digging/information
you can provide about the exact circumstances (and, again,
steps/environment to reproduce what you see) would be extremely helpful.

Thanks Manoj.

- Josh

On 11/2/18 7:16 PM, Manoj Ganesan wrote:
 > Thanks Josh for the response!
 >
 > I would definitely like to use protobuf serialization, but I'm
observing
 > performance issues trying to run queries with a large number of
results.
 > One problem is that I observe PQS runs out of memory, when its
trying to
 > (what looks like to me) serialize the results in Avatica. The
other is
 > that the phoenixdb python adapter itself spends a large amount of
time
 > in the logic
 >

<https://github.com/apache/phoenix/blob/master/python/phoenixdb/phoenixdb/cursor.py#L248>

 > where its converting the protobuf rows to python objects.
 >
 > Interestingly when we use sqlline-thin.py instead of python
phoenixdb,
 > the protobuf serialization works fine and responses are fast.
It's not
 > clear to me why PQS would have problems when using the python
adapter
 > and not when using sqlline-thin, do they follow different code paths
 > (especially around serialization)?
 >
     > Thanks again,
 > Manoj
 >
 > On Fri, Nov 2, 2018 at 4:05 PM Josh Elser mailto:els...@apache.org>
 > <mailto:els...@apache.org <mailto:els...@apache.org>>> wrote:
 >
 >     I would strongly suggest you do not use the JSON serialization.
 >
 >     The JSON support is implemented via Jackson which has no
means to make
 >     backwards compatibility "easy". On the contrast, protobuf
makes this
 >     extremely easy and we have multiple examples over the past
years where
 >     we've been able to fix bugs in a backwards compatible manner.
 >

Re: Python phoenixdb adapter and JSON serialization on PQS

2018-11-05 Thread Josh Elser
Is the OOME issue regardless of using the Java client (sqlline-thin) and 
the Python client? I would like to know more about this one. If you can 
share something that reproduces the problem for you, I'd like to look 
into it. The only suggestion I have at this point in time is to make 
sure you set a reasonable max-heap size in hbase-env.sh (e.g. -Xmx) via 
PHOENIX_QUERYSERVER_OPTS and have HBASE_CONF_DIR pointing to the right 
directory when you launch PQS.


Regarding performance, as you've described it, it sounds like the Python 
driver is just slower than the Java driver. You are right that the 
operations in PQS should be exactly the same, regardless of the client 
you're using -- that is how this architecture works. Avatica is a wire 
protocol that all clients use to talk to PQS. More digging/information 
you can provide about the exact circumstances (and, again, 
steps/environment to reproduce what you see) would be extremely helpful.


Thanks Manoj.

- Josh

On 11/2/18 7:16 PM, Manoj Ganesan wrote:

Thanks Josh for the response!

I would definitely like to use protobuf serialization, but I'm observing 
performance issues trying to run queries with a large number of results. 
One problem is that I observe PQS runs out of memory, when its trying to 
(what looks like to me) serialize the results in Avatica. The other is 
that the phoenixdb python adapter itself spends a large amount of time 
in the logic 
<https://github.com/apache/phoenix/blob/master/python/phoenixdb/phoenixdb/cursor.py#L248> 
where its converting the protobuf rows to python objects.


Interestingly when we use sqlline-thin.py instead of python phoenixdb, 
the protobuf serialization works fine and responses are fast. It's not 
clear to me why PQS would have problems when using the python adapter 
and not when using sqlline-thin, do they follow different code paths 
(especially around serialization)?


Thanks again,
Manoj

On Fri, Nov 2, 2018 at 4:05 PM Josh Elser <mailto:els...@apache.org>> wrote:


I would strongly suggest you do not use the JSON serialization.

The JSON support is implemented via Jackson which has no means to make
backwards compatibility "easy". On the contrast, protobuf makes this
extremely easy and we have multiple examples over the past years where
we've been able to fix bugs in a backwards compatible manner.

If you want the thin client to continue to work across versions, stick
with protobuf.

On 11/2/18 5:27 PM, Manoj Ganesan wrote:
 > Hey everyone,
 >
 > I'm trying to use the Python phoenixdb adapter work with JSON
 > serialization on PQS.
 >
 > I'm using Phoenix 4.14 and the adapter works fine with protobuf, but
 > when I try making it work with an older version of phoenixdb
(before the
 > JSON to protobuf switch was introduced), it just returns 0 rows.
I don't
 > see anything in particular wrong with the HTTP requests itself,
and they
 > seem to conform to the Avatica JSON spec
 > (http://calcite.apache.org/avatica/docs/json_reference.html).
 >
 > Here's the result (with some debug statements) that returns 0 rows.
 > Notice the *"firstFrame":{"offset":0,"done":true,"rows":[]* below:
 >
 > request body =  {"maxRowCount": -2, "connectionId":
 > "68c05d12-5770-47d6-b3e4-dba556db4790", "request":
"prepareAndExecute",
 > "statementId": 3, "sql": "SELECT col1, col2 from table limit 20"}
 > request headers =  {'content-type': 'application/json'}
 > _post_request: got response {'fp':  0x7f858330b9d0>, 'status': 200, 'will_close': False, 'chunk_left':
 > 'UNKNOWN', 'length': 1395, 'strict': 0, 'reason': 'OK',
'version': 11,
 > 'debuglevel': 0, 'msg':  0x7f84fb50be18>, 'chunked': 0, '_method': 'POST'}
 > response.read(): body =
 >

{"response":"executeResults","missingStatement":false,"rpcMetadata":{"response":"rpcMetadata","serverAddress":"ip-10-55-6-247:8765"},"results":[{"response":"resultSet","connectionId":"68c05d12-5770-47d6-b3e4-dba556db4790","statementId":3,"ownStatement":true,"signature":{"columns":[{"ordinal":0,"autoIncrement":false,"caseSensitive":false,"searchable":true,"currency":false,"nullable
 >

":0,"signed":true,"displaySize":40,"label":"COL1","columnName":"COL1","schemaName":"","precision":0,"scale":0,"tableName":"TABLE","catalogName":"","type"

Re: ABORTING region server and following HBase cluster "crash"

2018-11-05 Thread Josh Elser
Thanks, Neelesh. It came off to me like "Phoenix is no good, Cassandra 
has something that works better".


I appreciate you taking the time to clarify! That really means a lot.

On 11/2/18 8:14 PM, Neelesh wrote:
By no means am I judging Phoenix based on this. This is simply a design 
trade-off (scylladb goes the same route and builds global indexes). I 
appreciate all the effort that has gone in to Phoenix, and it was indeed 
a life saver. But the technical point remains that single node failures 
have potential to cascade to the entire cluster. That's the nature of 
global indexes, not specific to phoenix.


I apologize if my response came off as dismissing phoenix altogether. 
FWIW, I'm a big advocate of phoenix at my org internally, albeit for the 
newer version.



On Fri, Nov 2, 2018, 4:09 PM Josh Elser <mailto:els...@apache.org>> wrote:


I would strongly disagree with the assertion that this is some
unavoidable problem. Yes, an inverted index is a data structure which,
by design, creates a hotspot (phrased another way, this is "data
locality").

Lots of extremely smart individuals have spent a significant amount of
time and effort in stabilizing secondary indexes in the past 1-2 years,
not to mention others spending time on a local index implementation.
Judging Phoenix in its entirety based off of an arbitrarily old version
of Phoenix is disingenuous.

On 11/2/18 2:00 PM, Neelesh wrote:
 > I think this is an unavoidable problem in some sense, if global
indexes
 > are used. Essentially global indexes create a  graph of dependent
region
 > servers due to index rpc calls from one RS to another. Any single
 > failure is bound to affect the entire graph, which under
reasonable load
 > becomes the entire HBase cluster. We had to drop global indexes
just to
 > keep the cluster running for more than a few days.
 >
 > I think Cassandra has local secondary indexes precisely because
of this
 > issue. Last I checked there were significant pending improvements
 > required for Phoenix local indexes, especially around read paths
( not
 > utilizing primary key prefixes in secondary index reads where
possible,
 > for example)
 >
 >
 > On Thu, Sep 13, 2018, 8:12 PM Jonathan Leech mailto:jonat...@gmail.com>
 > <mailto:jonat...@gmail.com <mailto:jonat...@gmail.com>>> wrote:
 >
 >     This seems similar to a failure scenario I’ve seen a couple
times. I
 >     believe after multiple restarts you got lucky and tables were
 >     brought up by Hbase in the correct order.
 >
 >     What happens is some kind of semi-catastrophic failure where 1 or
 >     more region servers go down with edits that weren’t flushed,
and are
 >     only in the WAL. These edits belong to regions whose tables have
 >     secondary indexes. Hbase wants to replay the WAL before
bringing up
 >     the region server. Phoenix wants to talk to the index region
during
 >     this, but can’t. It fails enough times then stops.
 >
 >     The more region servers / tables / indexes affected, the more
likely
 >     that a full restart will get stuck in a classic deadlock. A good
 >     old-fashioned data center outage is a great way to get
started with
 >     this kind of problem. You might make some progress and get stuck
 >     again, or restart number N might get those index regions
initialized
 >     before the main table.
 >
 >     The sure fire way to recover a cluster in this condition is to
 >     strategically disable all the tables that are failing to come up.
 >     You can do this from the Hbase shell as long as the master is
 >     running. If I remember right, it’s a pain since the disable
command
 >     will hang. You might need to disable a table, kill the shell,
 >     disable the next table, etc. Then restart. You’ll eventually
have a
 >     cluster with all the region servers finally started, and a
bunch of
 >     disabled regions. If you disabled index tables, enable one,
wait for
 >     it to become available; eg its WAL edits will be replayed, then
 >     enable the associated main table and wait for it to come
online. If
 >     Hbase did it’s job without error, and your failure didn’t include
 >     losing 4 disks at once, order will be restored. Lather, rinse,
 >     repeat until everything is enabled and online.
 >
 >      A big enough failure sprinkled with a little bit of
bad luck
 >     and what seems to be a Phoenix flaw == deadlock trying to get
HBASE
 >     to start up. Fix by forcing the order 

Re: ABORTING region server and following HBase cluster "crash"

2018-11-02 Thread Josh Elser
I would strongly disagree with the assertion that this is some 
unavoidable problem. Yes, an inverted index is a data structure which, 
by design, creates a hotspot (phrased another way, this is "data locality").


Lots of extremely smart individuals have spent a significant amount of 
time and effort in stabilizing secondary indexes in the past 1-2 years, 
not to mention others spending time on a local index implementation. 
Judging Phoenix in its entirety based off of an arbitrarily old version 
of Phoenix is disingenuous.


On 11/2/18 2:00 PM, Neelesh wrote:
I think this is an unavoidable problem in some sense, if global indexes 
are used. Essentially global indexes create a  graph of dependent region 
servers due to index rpc calls from one RS to another. Any single 
failure is bound to affect the entire graph, which under reasonable load 
becomes the entire HBase cluster. We had to drop global indexes just to 
keep the cluster running for more than a few days.


I think Cassandra has local secondary indexes precisely because of this 
issue. Last I checked there were significant pending improvements 
required for Phoenix local indexes, especially around read paths ( not 
utilizing primary key prefixes in secondary index reads where possible, 
for example)



On Thu, Sep 13, 2018, 8:12 PM Jonathan Leech <mailto:jonat...@gmail.com>> wrote:


This seems similar to a failure scenario I’ve seen a couple times. I
believe after multiple restarts you got lucky and tables were
brought up by Hbase in the correct order.

What happens is some kind of semi-catastrophic failure where 1 or
more region servers go down with edits that weren’t flushed, and are
only in the WAL. These edits belong to regions whose tables have
secondary indexes. Hbase wants to replay the WAL before bringing up
the region server. Phoenix wants to talk to the index region during
this, but can’t. It fails enough times then stops.

The more region servers / tables / indexes affected, the more likely
that a full restart will get stuck in a classic deadlock. A good
old-fashioned data center outage is a great way to get started with
this kind of problem. You might make some progress and get stuck
again, or restart number N might get those index regions initialized
before the main table.

The sure fire way to recover a cluster in this condition is to
strategically disable all the tables that are failing to come up.
You can do this from the Hbase shell as long as the master is
running. If I remember right, it’s a pain since the disable command
will hang. You might need to disable a table, kill the shell,
disable the next table, etc. Then restart. You’ll eventually have a
cluster with all the region servers finally started, and a bunch of
disabled regions. If you disabled index tables, enable one, wait for
it to become available; eg its WAL edits will be replayed, then
enable the associated main table and wait for it to come online. If
Hbase did it’s job without error, and your failure didn’t include
losing 4 disks at once, order will be restored. Lather, rinse,
repeat until everything is enabled and online.

 A big enough failure sprinkled with a little bit of bad luck
and what seems to be a Phoenix flaw == deadlock trying to get HBASE
to start up. Fix by forcing the order that Hbase brings regions
online. Finally, never go full restart. 

 > On Sep 10, 2018, at 7:30 PM, Batyrshin Alexander
<0x62...@gmail.com <mailto:0x62...@gmail.com>> wrote:
 >
 > After update web interface at Master show that every region
server now 1.4.7 and no RITS.
 >
 > Cluster recovered only when we restart all regions servers 4 times...
 >
 >> On 11 Sep 2018, at 04:08, Josh Elser mailto:els...@apache.org>> wrote:
 >>
 >> Did you update the HBase jars on all RegionServers?
 >>
 >> Make sure that you have all of the Regions assigned (no RITs).
There could be a pretty simple explanation as to why the index can't
be written to.
 >>
 >>> On 9/9/18 3:46 PM, Batyrshin Alexander wrote:
 >>> Correct me if im wrong.
 >>> But looks like if you have A and B region server that has index
and primary table then possible situation like this.
 >>> A and B under writes on table with indexes
 >>> A - crash
 >>> B failed on index update because A is not operating then B
starting aborting
 >>> A after restart try to rebuild index from WAL but B at this
time is aborting then A starting aborting too
 >>> From this moment nothing happens (0 requests to region servers)
and A and B is not responsible from Master-status web interface
 >>>> On 9 Sep 2018, at 04:38, Batyrshin Alexander
  

Re: Python phoenixdb adapter and JSON serialization on PQS

2018-11-02 Thread Josh Elser

I would strongly suggest you do not use the JSON serialization.

The JSON support is implemented via Jackson which has no means to make 
backwards compatibility "easy". On the contrast, protobuf makes this 
extremely easy and we have multiple examples over the past years where 
we've been able to fix bugs in a backwards compatible manner.


If you want the thin client to continue to work across versions, stick 
with protobuf.
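
For the Java thin client, a minimal sketch of staying on protobuf is below; 
the host is a placeholder, and the serialization property in the URL must 
match phoenix.queryserver.serialization on the PQS side.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ThinClientProtobuf {
    public static void main(String[] args) throws Exception {
        // Thin-driver URL: points at PQS over HTTP, serialization must match the server.
        String url = "jdbc:phoenix:thin:url=http://pqs-host:8765;serialization=PROTOBUF";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM SYSTEM.CATALOG")) {
            rs.next();
            System.out.println("Rows in SYSTEM.CATALOG: " + rs.getLong(1));
        }
    }
}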


On 11/2/18 5:27 PM, Manoj Ganesan wrote:

Hey everyone,

I'm trying to use the Python phoenixdb adapter work with JSON 
serialization on PQS.


I'm using Phoenix 4.14 and the adapter works fine with protobuf, but 
when I try making it work with an older version of phoenixdb (before the 
JSON to protobuf switch was introduced), it just returns 0 rows. I don't 
see anything in particular wrong with the HTTP requests itself, and they 
seem to conform to the Avatica JSON spec 
(http://calcite.apache.org/avatica/docs/json_reference.html).


Here's the result (with some debug statements) that returns 0 rows. 
Notice the *"firstFrame":{"offset":0,"done":true,"rows":[]* below:


request body =  {"maxRowCount": -2, "connectionId": 
"68c05d12-5770-47d6-b3e4-dba556db4790", "request": "prepareAndExecute", 
"statementId": 3, "sql": "SELECT col1, col2 from table limit 20"}

request headers =  {'content-type': 'application/json'}
_post_request: got response {'fp': 0x7f858330b9d0>, 'status': 200, 'will_close': False, 'chunk_left': 
'UNKNOWN', 'length': 1395, 'strict': 0, 'reason': 'OK', 'version': 11, 
'debuglevel': 0, 'msg': 0x7f84fb50be18>, 'chunked': 0, '_method': 'POST'}
response.read(): body =  
{"response":"executeResults","missingStatement":false,"rpcMetadata":{"response":"rpcMetadata","serverAddress":"ip-10-55-6-247:8765"},"results":[{"response":"resultSet","connectionId":"68c05d12-5770-47d6-b3e4-dba556db4790","statementId":3,"ownStatement":true,"signature":{"columns":[{"ordinal":0,"autoIncrement":false,"caseSensitive":false,"searchable":true,"currency":false,"nullable
":0,"signed":true,"displaySize":40,"label":"COL1","columnName":"COL1","schemaName":"","precision":0,"scale":0,"tableName":"TABLE","catalogName":"","type":{"type":"scalar","id":4,"name":"INTEGER","rep":"PRIMITIVE_INT"},"readOnly":true,"writable":false,"definitelyWritable":false,"columnClassName":"java.lang.Integer"},{"ordinal":1,"autoIncrement":false,"caseSensitive":false,"searchable":true,"currency":false,"nullable":0,"signed":true,"displaySize":40,"label":"COL2","columnName":"COL2","schemaName":"","precision":0,"scale":0,"tableName":"TABLE","catalogName":"","type":{"type":"scalar","id":4,"name":"INTEGER","rep":"PRIMITIVE_INT"},"readOnly":true,"writable":false,"definitelyWritable":false,"columnClassName":"java.lang.Integer"}],"sql":null,"parameters":[],"cursorFactory":{"style":"LIST","clazz":null,"fieldNames":null},"statementType":null},*"firstFrame":{"offset":0,"done":true,"rows":[]*},"updateCount":-1,"rpcMetadata":{"response":"rpcMetadata","serverAddress":"ip-10-55-6-247:8765"}}]} 



The same query issued against a PQS started with PROTOBUF serialization 
and using a newer phoenixdb adapter returns the correct number of rows.


Has anyone had luck making this work?

Thanks,
Manoj



Re: Phoenix Performances & Uses Cases

2018-10-29 Thread Josh Elser
Specifically to your last two points about windowing, transforming, 
grouping, etc: my current opinion is that Hive does certain analytical 
style operations much better than Phoenix. Personally, I don't think it 
makes sense for Phoenix to try to "catch up". It would take years for us 
to build such capabilities on par with what they have.


Some of us have been making efforts to ease data access between Hive and 
Phoenix via the PhoenixStorageHandler for Hive. The goal of this is that 
it will make your life easier to use the correct tool for the job. Use 
Hive when Hive does things well, and use Phoenix when Phoenix does it well.


(Again, this is my opinion. It is not meant to be some declaration of 
direction by the entire Apache Phoenix community)


On 10/27/18 7:50 AM, Nicolas Paris wrote:

Hi

I am benchmarking Phoenix to better understand its strengths and
weaknesses. My basis is to compare it to PostgreSQL for an OLTP workload and
Hive LLAP for an OLAP workload. I am testing on a 10-computer cluster
instance with Hive (2.1) and Phoenix (4.8), 220 GB RAM/32 CPUs, versus a
PostgreSQL (9.6) instance with 128 GB RAM and 32 CPUs.

Right now, my opinion is:
- when getting a subset on a large table, phoenix performs the
   best
- when getting a subset from multiple large tables, postgres performs
   the best
- when getting a subset from a large table joined to one or many small
   tables, phoenix performs the best
- when ingesting high frequency data, Phoenix performs the best
- when grouping by query, hive > postgresql > phoenix
- when windowing, transforming, grouping, hive performs the best,
   phoenix the worst

Finally, my conclusion is that phoenix is not intended at all for analytics
queries such as grouping, windowing, and joining large tables. It suits
very specific use cases well, like maintaining a very large table with
possibly some small tables to join with (such as time-series data, or binary
storage data with HBase MOB enabled).

Am I missing something ?

Thanks,



Re: Phoenix metrics error on thin client

2018-10-23 Thread Josh Elser
The thick client talks directly to HBase. The thin client talks to PQS. 
You cannot mix-and-match.


Glad to hear you got it working. How can the documentation be improved 
to make this more clear?


On 10/23/18 9:11 PM, Monil Gandhi wrote:

Hello
Update. I was able to figure this out. Thanks for the initial pointer :)
On Mon, Oct 22, 2018 at 10:54 PM Monil Gandhi <mailto:mgand...@gmail.com>> wrote:


Hello,
Thanks for the earlier reply.
I am a little confused by the documentation and the response from
Josh. This may be my limited knowledge of Phoenix.

Can I connect to the server with the thick client instead of the thin
client, given that my server is running the thin client and the thick
client seems to be running on PQS?

Additionally I followed the directions on
https://phoenix.apache.org/server.html and the above linked
conversation, but I am unable to generate any kind of metrics for
any queries. Please note that in this scenario I am running my
queries via the think client installed on PQS

For a particular query, I am trying to see how many rows are being
scanned and across how many region servers. If there is an easier
way, please let me know

On Thu, Oct 18, 2018 at 7:00 PM Monil Gandhi mailto:mgand...@gmail.com>> wrote:

Okay. Will take a look. Thanks
On Wed, Oct 17, 2018 at 8:28 AM Josh Elser mailto:els...@apache.org>> wrote:

The methods that you are invoking assume that the Phoenix
JDBC driver
(the java class org.apache.phoenix.jdbc.PhoenixDriver) is in
use. It's
not, so you get this error.

The Phoenix "thick" JDBC driver is what's running inside of
the Phoenix
Query Server, just not in your local JVM. As such, you need
to look at
PQS for metrics.

You probably want to look at what was done in
https://issues.apache.org/jira/browse/PHOENIX-3655.

On 10/16/18 2:49 PM, Monil Gandhi wrote:
 > Hello,
 > I am trying to collect some metrics on certain queries.
Here is the code
 > that I have
 >
 > Properties props =new Properties();
 >
props.setProperty(QueryServices.COLLECT_REQUEST_LEVEL_METRICS,
"true");
 > props.setProperty("phoenix.trace.frequency", "always");
 >
 > try (Connection conn = DriverManager.getConnection(url,
props)) {
 >      conn.setAutoCommit(true);
 >
 > PreparedStatement stmt = conn.prepareStatement(query);
 >
 > Map
overAllQueryMetrics =null;
 > Map> requestReadMetrics =null;
 > try (ResultSet rs = stmt.executeQuery()) {
 >          rs.next();
 > requestReadMetrics =
PhoenixRuntime.getRequestReadMetricInfo(rs);
 > // log or report metrics as needed
 > PhoenixRuntime.resetMetrics(rs);
 > rs.close();
 > }
 > }
 >
 >
 > However, rs.next() throws the following error
 > java.sql.SQLException: does not implement 'class
 > org.apache.phoenix.jdbc.PhoenixResultSet'
 >
 > I am not sure why the error is happening. Are metrics not
supported with
 > thin client?
 >
 > If not how do I get query level metrics?
 >
 > Thanks



Re: Connection Pooling?

2018-10-18 Thread Josh Elser
Batyrshin, you asked about statement caching which is different than 
connection pooling.


@JMS, yes, the FAQ is accurate (as is the majority of the rest of the 
documentation ;))
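
For anyone skimming the archive, a minimal sketch of the pattern the FAQ 
recommends (a fresh connection per unit of work, closed by try-with-resources); 
the URL and query are placeholders.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class PerRequestConnection {
    public static void main(String[] args) throws Exception {
        for (int i = 0; i < 3; i++) {
            // New Phoenix connection per request; no pooling. This is cheap because
            // the heavy HBase/ZooKeeper state is cached underneath the driver.
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181:/hbase");
                 PreparedStatement stmt = conn.prepareStatement("SELECT COUNT(*) FROM SYSTEM.CATALOG");
                 ResultSet rs = stmt.executeQuery()) {
                rs.next();
                System.out.println("SYSTEM.CATALOG rows: " + rs.getLong(1));
            }
        }
    }
}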


On 10/18/18 1:14 PM, Batyrshin Alexander wrote:
I've already asked the same question in this thread - 
http://apache-phoenix-user-list.1124778.n5.nabble.com/Statements-caching-td4674.html


On 18 Oct 2018, at 19:44, Jean-Marc Spaggiari > wrote:


Hi,

Is this statement in the FAQ still valid?

"If Phoenix Connections are reused, it is possible that the underlying 
HBase connection is not always left in a healthy state by the previous 
user. It is better to create new Phoenix Connections to ensure that 
you avoid any potential issues."
https://phoenix.apache.org/faq.html#Should_I_pool_Phoenix_JDBC_Connections 



Thanks,

JMS




Re: Phoenix metrics error on thin client

2018-10-17 Thread Josh Elser
The methods that you are invoking assume that the Phoenix JDBC driver 
(the java class org.apache.phoenix.jdbc.PhoenixDriver) is in use. It's 
not, so you get this error.


The Phoenix "thick" JDBC driver is what's running inside of the Phoenix 
Query Server, just not in your local JVM. As such, you need to look at 
PQS for metrics.


You probably want to look at what was done in 
https://issues.apache.org/jira/browse/PHOENIX-3655.


On 10/16/18 2:49 PM, Monil Gandhi wrote:

Hello,
I am trying to collect some metrics on certain queries. Here is the code 
that I have


Properties props = new Properties();
props.setProperty(QueryServices.COLLECT_REQUEST_LEVEL_METRICS, "true");
props.setProperty("phoenix.trace.frequency", "always");

try (Connection conn = DriverManager.getConnection(url, props)) {
    conn.setAutoCommit(true);

    PreparedStatement stmt = conn.prepareStatement(query);

    Map<MetricType, Long> overAllQueryMetrics = null;
    Map<String, Map<MetricType, Long>> requestReadMetrics = null;
    try (ResultSet rs = stmt.executeQuery()) {
        rs.next();
        requestReadMetrics = PhoenixRuntime.getRequestReadMetricInfo(rs);
        // log or report metrics as needed
        PhoenixRuntime.resetMetrics(rs);
        rs.close();
    }
}


However, rs.next() throws the following error
java.sql.SQLException: does not implement 'class 
org.apache.phoenix.jdbc.PhoenixResultSet'


I am not sure why the error is happening. Are metrics not supported with 
thin client?


If not how do I get query level metrics?

Thanks


Re: ON DUPLICATE KEY with Global Index

2018-10-09 Thread Josh Elser
Can you elaborate on what is unclear about the documentation? This 
exception and the related documentation read as being in support of each 
other to me.
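
For reference, a short, untested sketch of the atomic upsert syntax from the 
page referenced below, against a table with no global index; the connection 
URL, table, and column names are placeholders.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class AtomicUpsertExample {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181:/hbase");
             Statement stmt = conn.createStatement()) {
            stmt.execute("CREATE TABLE IF NOT EXISTS PAGE_HITS (URL VARCHAR PRIMARY KEY, HITS BIGINT)");
            // Atomically initialize-or-increment the counter while the row is locked.
            stmt.execute("UPSERT INTO PAGE_HITS (URL, HITS) VALUES ('/home', 1) "
                       + "ON DUPLICATE KEY UPDATE HITS = HITS + 1");
            conn.commit();
        }
    }
}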


On 10/9/18 5:39 AM, Batyrshin Alexander wrote:

  Hello all,
The documentation (http://phoenix.apache.org/atomic_upsert.html) says:

"Although global indexes on columns being atomically updated are supported, it’s not 
recommended as a potentially a separate RPC across the wire would be made while the row 
is under lock to maintain the secondary index."

But in practice we get:
CANNOT_USE_ON_DUP_KEY_WITH_GLOBAL_IDX(1224, "42Z24", "The ON DUPLICATE KEY clause 
may not be used when a table has a global index." )

Is this a bug, or is the documentation outdated?



Re: Specifying HBase cell visibility labels or running as a particular user

2018-10-08 Thread Josh Elser

Hey Mike,

You can definitely authenticate yourself with the Kerberos 
credentials of your choice. There are generally two ways you can do this:


1. Login using UserGroupInformation APIs and then make JDBC calls with 
the Phoenix JDBC driver (thick or thin)
2. Use the principal+keytab JDBC url "options" and let Phoenix do it for 
you.


These have had some issues around them in the past, but, if you're using 
a recent release, you should be fine.
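
A minimal sketch of both options is below; the principal, keytab path, quorum, 
and znode are placeholders, not values from this thread.

import java.security.PrivilegedExceptionAction;
import java.sql.Connection;
import java.sql.DriverManager;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.security.UserGroupInformation;

public class KerberosConnectExamples {
    public static void main(String[] args) throws Exception {
        // Option 1: log in explicitly via UserGroupInformation, then use plain JDBC.
        Configuration conf = HBaseConfiguration.create();
        UserGroupInformation.setConfiguration(conf);
        UserGroupInformation ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(
            "analyst@EXAMPLE.COM", "/etc/security/keytabs/analyst.keytab");
        try (Connection conn = ugi.doAs((PrivilegedExceptionAction<Connection>) () ->
                 DriverManager.getConnection("jdbc:phoenix:zk1,zk2,zk3:2181:/hbase-secure"))) {
            System.out.println("Connected as " + ugi.getUserName());
        }

        // Option 2: embed principal and keytab in the URL and let Phoenix log in for you.
        String url = "jdbc:phoenix:zk1,zk2,zk3:2181:/hbase-secure"
                   + ":analyst@EXAMPLE.COM:/etc/security/keytabs/analyst.keytab";
        try (Connection conn = DriverManager.getConnection(url)) {
            System.out.println("Connected via principal+keytab URL options");
        }
    }
}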


I don't believe we have any integration with HBase visibility labels, 
and I think this would be extremely tricky to get correct (Phoenix does 
a significant amount of reads on your behalf for a query via 
coprocessors. You'd have to update each of these to pass through and set 
the labels everywhere).


On 10/8/18 4:36 PM, Mike Thomsen wrote:
We have a particular use case where we'd like to be able to effectively 
do a SELECT on a table and say either "execute as this user" or "execute 
with this list of HBase visibility tokens."


This looks somewhat promising for the former:

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_installing_manually_book/content/validating-phoenix-installation.html

It looks like we could at least allow some of our users to have a 
Kerberos keytab set up for them.


Any thoughts on how to approach this? I know it may be uncharted 
territory for Phoenix and don't mind trying to get my hands dirty on 
working on a PR or something.


Thanks,

Mike



Re: Table dead lock: ERROR 1120 (XCL20): Writes to table blocked until index can be updated

2018-10-02 Thread Josh Elser
HBase will invalidate the location of a Region on seeing certain 
exceptions (including NotServingRegionException). After it sees the 
exception you have copied below, it should re-fetch the location of the 
Region.


If HBase keeps trying to access a Region on a RS that isn't hosting it, 
either hbase:meta is wrong or the HBase client has a bug.


However, to the point here, if that region was split successfully, 
clients should not be reading from that region anymore -- they would 
read from the daughters of that split region.


On 10/2/18 2:34 PM, Batyrshin Alexander wrote:
We tried branch 4.14-HBase-1.4 at commit 
https://github.com/apache/phoenix/commit/52893c240e4f24e2bfac0834d35205f866c16ed8


Is there any way to invalidate the meta-cache on the event of an index region 
split? Maybe there is some option to set a max time-to-live for the cache?


Watching this on regions servers:

At 09:34 regions *96c3ede1c40c98959e60bd6fc0e07269* split on prod019

Oct 02 09:34:39 prod019 hbase[152127]: 2018-10-02 09:34:39,719 INFO 
  [regionserver/prod019/10.0.0.19:60020-splits-1538462079117] 
regionserver.SplitRequest: Region split, hbase:meta updated, and report 
to master. Parent=IDX_MARK_O,\x0B\x46200020qC8kovh\x00\x01\x80\x00\x
01e\x89\x8B\x99@\x00\x00\x00\x00,1537400033958.*96c3ede1c40c98959e60bd6fc0e07269*., 
new regions: 
IDX_MARK_O,\x0B\x46200020qC8kovh\x00\x01\x80\x00\x01e\x89\x8B\x99@\x00\x00\x00\x00,1538462079161.80fc2516619d8665789b0c5a2bca8a8b., 
IDX_MARK_O,\x0BON_SCHFDOPPR_2AL-5602
2B7D-2F90-4AA5-8125-4F4001B5BE0D-0_2AL-C0D76C01-EE7E-496B-BCD6-F6488956F75A-0_20180228_7E372181-F23D-4EBE-9CAD-5F5218C9798I\x46186195_5.UHQ=\x00\x02\x80\x00\x01a\xD3\xEA@\x80\x00\x00\x00\x00,1538462079161.24b6675d9e51067a21e58f294a9f816b.. 
Split took 0sec


Fail at 11:51 prod018

Oct 02 11:51:13 prod018 hbase[108476]: 2018-10-02 11:51:13,752 WARN 
  [hconnection-0x4131af19-shared--pool24-t26652] client.AsyncProcess: 
#164, table=IDX_MARK_O, attempt=1/1 failed=1ops, last exception: 
org.apache.hadoop.hbase.NotServingRegionException: 
org.apache.hadoop.hbase.NotServingRegionException: 
Region IDX_MARK_O,\x0B\x46200020qC8kovh\x00\x01\x80\x00\x01e\x89\x8B\x99@\x00\x00\x00\x00,1537400033958.*96c3ede1c40c98959e60bd6fc0e07269*. 
is not online on prod019,60020,1538417663874


Fail at 13:38 on prod005

Oct 02 13:38:06 prod005 hbase[197079]: 2018-10-02 13:38:06,040 WARN 
  [hconnection-0x5e744e65-shared--pool8-t31214] client.AsyncProcess: 
#53, table=IDX_MARK_O, attempt=1/1 failed=11ops, last exception: 
org.apache.hadoop.hbase.NotServingRegionException: 
org.apache.hadoop.hbase.NotServingRegionException: 
Region IDX_MARK_O,\x0B\x46200020qC8kovh\x00\x01\x80\x00\x01e\x89\x8B\x99@\x00\x00\x00\x00,1537400033958.*96c3ede1c40c98959e60bd6fc0e07269*. 
is not online on prod019,60020,1538417663874


On 27 Sep 2018, at 01:04, Ankit Singhal wrote:


You might be hitting PHOENIX-4785; you can apply the 
patch on top of 4.14 and see if it fixes your problem.


Regards,
Ankit Singhal

On Wed, Sep 26, 2018 at 2:33 PM Batyrshin Alexander <0x62...@gmail.com> wrote:


Any advice? Can anyone help?
I can reproduce the problem and capture more logs if needed.


On 21 Sep 2018, at 02:13, Batyrshin Alexander <0x62...@gmail.com> wrote:

Looks like the lock goes away 30 minutes after the index region split.
So I can assume that this issue comes from the cache that is configured
by this option: *phoenix.coprocessor.maxMetaDataCacheTimeToLiveMs*
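
For reference, a hedged sketch of how that server-side TTL would typically be set in 
hbase-site.xml on the region servers; the 180000 ms value below is purely illustrative:

    <!-- hbase-site.xml on the region servers; value shown is illustrative only -->
    <property>
      <name>phoenix.coprocessor.maxMetaDataCacheTimeToLiveMs</name>
      <value>180000</value>
    </property>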




On 21 Sep 2018, at 00:15, Batyrshin Alexander <0x62...@gmail.com> wrote:

And how this split looks at Master logs:

Sep 20 19:45:04 prod001 hbase[10838]: 2018-09-20 19:45:04,888
INFO  [AM.ZK.Worker-pool5-t282] master.RegionStates: Transition
{3e44b85ddf407da831dbb9a871496986 state=OPEN,
ts=1537304859509, server=prod013,60020,1537304282885} to
{3e44b85ddf407da831dbb9a871496986 state=SPLITTING,
ts=1537461904888, server=prod
Sep 20 19:45:05 prod001 hbase[10838]: 2018-09-20 19:45:05,340
INFO  [AM.ZK.Worker-pool5-t284] master.RegionStates: Transition
{3e44b85ddf407da831dbb9a871496986
state=SPLITTING, ts=1537461905340,
server=prod013,60020,1537304282885} to
{3e44b85ddf407da831dbb9a871496986 state=SPLIT, ts=1537461905340,
server=pro
Sep 20 19:45:05 prod001 hbase[10838]: 2018-09-20 19:45:05,340
INFO  [AM.ZK.Worker-pool5-t284] master.RegionStates: Offlined
3e44b85ddf407da831dbb9a871496986 from prod013,60020,1537304282885
Sep 20 19:45:05 prod001 hbase[10838]: 2018-09-20 19:45:05,341
INFO  [AM.ZK.Worker-pool5-t284] master.RegionStates: Transition
{33cba925c7acb347ac3f5e70e839c3cb
state=SPLITTING_NEW, ts=1537461905340,
server=prod013,60020,1537304282885} to
{33cba925c7acb347ac3f5e70e839c3cb state=OPEN, ts=1537461905341,

Re: org.apache.phoenix.shaded.org.apache.thrift.TException: Unable to discover transaction service. -> TException: Unable to discover transaction service.

2018-09-26 Thread Josh Elser

If you're using HBase with Hadoop3, HBase should have Hadoop3 jars.

Re-build HBase using the -Dhadoop.profile=3.0 (I think it is) CLI option.
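
Something along these lines from an HBase source checkout; treat the exact flag as an 
assumption and check your branch's pom.xml or the HBase book for the precise Hadoop 3 
build option:

    mvn clean install -DskipTests -Dhadoop.profile=3.0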

On 9/26/18 7:21 AM, Francis Chuang wrote:
Upon further investigation, it appears that this is because 
org.apache.hadoop.security.authentication.util.KerberosUtil.hasKerberosKeyTab 
is only available in Hadoop 2.8+. HBase ships with Hadoop 2.7.4 jars.


I noticed that Hadoop was bumped from 2.7.4 to 3.0.0 a few months ago to 
fix PQS/Avatica issues: 
https://github.com/apache/phoenix/blame/master/pom.xml#L70


I think this causes Phoenix to expect some things that are available in 
Hadoop 3.0.0, but are not present in HBase's Hadoop 2.7.4 jars.


I think I can try and replace the hadoop-*.jar files in hbase/lib with 
the equivalent 2.8.5 versions, however I am not familiar with Java and 
the hadoop project, so I am not sure if this is going to introduce issues.


On 26/09/2018 4:44 PM, Francis Chuang wrote:

I wonder if this is because:
- HBase's binary distribution ships with Hadoop 2.7.4 jars.
- Phoenix 5.0.0 has Hadoop 3.0.0 declared in its pom.xml: 
https://github.com/apache/phoenix/blob/8a819c6c3b4befce190c6ac759f744df511de61d/pom.xml#L70 

- Tephra has Hadoop 2.2.0 declared in its pom.xml: 
https://github.com/apache/incubator-tephra/blob/master/pom.xml#L211


On 26/09/2018 4:03 PM, Francis Chuang wrote:

Hi all,

I am using Phoenix 5.0.0 with HBase 2.0.0. I am seeing errors while 
trying to create transactional tables using Phoenix.


I am using my Phoenix + HBase all in one docker image available here: 
https://github.com/Boostport/hbase-phoenix-all-in-one


This is the error: 
org.apache.phoenix.shaded.org.apache.thrift.TException: Unable to 
discover transaction service. -> TException: Unable to discover 
transaction service.


I checked the tephra logs and got the following:

Exception in thread "HDFSTransactionStateStorage STARTING" Exception 
in thread "ThriftRPCServer" 
com.google.common.util.concurrent.ExecutionError: 
java.lang.NoSuchMethodError: 
org.apache.hadoop.security.authentication.util.KerberosUtil.hasKerberosKeyTab(Ljavax/security/auth/Subject;)Z 

    at 
com.google.common.util.concurrent.Futures.wrapAndThrowUnchecked(Futures.java:1008) 

    at 
com.google.common.util.concurrent.Futures.getUnchecked(Futures.java:1001) 

    at 
com.google.common.util.concurrent.AbstractService.startAndWait(AbstractService.java:220) 

    at 
com.google.common.util.concurrent.AbstractIdleService.startAndWait(AbstractIdleService.java:106) 

    at 
org.apache.tephra.TransactionManager.doStart(TransactionManager.java:245) 

    at 
com.google.common.util.concurrent.AbstractService.start(AbstractService.java:170) 

    at 
com.google.common.util.concurrent.AbstractService.startAndWait(AbstractService.java:220) 

    at 
org.apache.tephra.distributed.TransactionServiceThriftHandler.init(TransactionServiceThriftHandler.java:249) 

    at 
org.apache.tephra.rpc.ThriftRPCServer.startUp(ThriftRPCServer.java:177)
    at 
com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:47) 


    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoSuchMethodError: 
org.apache.hadoop.security.authentication.util.KerberosUtil.hasKerberosKeyTab(Ljavax/security/auth/Subject;)Z 

    at 
org.apache.hadoop.security.UserGroupInformation.(UserGroupInformation.java:715) 

    at 
org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:925) 

    at 
org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:873) 

    at 
org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:740) 

    at 
org.apache.hadoop.fs.FileSystem$Cache$Key.(FileSystem.java:3472)
    at 
org.apache.hadoop.fs.FileSystem$Cache.getUnique(FileSystem.java:3310)

    at org.apache.hadoop.fs.FileSystem.newInstance(FileSystem.java:529)
    at 
org.apache.tephra.persist.HDFSTransactionStateStorage.startUp(HDFSTransactionStateStorage.java:104) 

    at 
com.google.common.util.concurrent.AbstractIdleService$1$1.run(AbstractIdleService.java:43) 


    ... 1 more
2018-09-26 04:31:11,290 INFO [leader-election-tx.service-leader] 
distributed.TransactionService (TransactionService.java:leader(115)) 
- Transaction Thrift Service didn't start on /0.0.0.0:15165
java.lang.NoSuchMethodError: 
org.apache.hadoop.security.authentication.util.KerberosUtil.hasKerberosKeyTab(Ljavax/security/auth/Subject;)Z 

    at 
org.apache.hadoop.security.UserGroupInformation.(UserGroupInformation.java:715) 

    at 
org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:925) 

    at 
org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:873) 

    at 
org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:740) 

    at 

Re: Phoenix 5.0 could not commit transaction: org.apache.phoenix.execute.CommitException: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: org.apache.phoenix.hbase

2018-09-25 Thread Josh Elser
Your assumptions are not unreasonable :) Phoenix 5.0.x should certainly 
work with HBase 2.0.x. Glad to see that it's been corrected already 
(embarrassing that I don't even remember reviewing this).


Let me start a thread on dev@phoenix about a 5.0.1 or a 5.1.0. We need 
to have a Phoenix 5.x that works with all of HBase 2.0.x (and hopefully 
2.1.x too).



On 9/25/18 9:25 PM, Francis Chuang wrote:
After some investigation, I found that Phoenix 5.0.0 is only compatible 
with HBase 2.0.0.


In 2.0.1 and onward, compare(final Cell a, final Cell b) in 
CellComparatorImpl was changed to final: 
https://github.com/apache/hbase/blame/master/hbase-common/src/main/java/org/apache/hadoop/hbase/CellComparatorImpl.java#L67


This change affected HBase 2.0.1 and 2.0.2.

As Phoenix 5.0.0 relies on this behavior: 
https://github.com/apache/phoenix/blob/8a819c6c3b4befce190c6ac759f744df511de61d/phoenix-core/src/main/java/org/apache/phoenix/hbase/index/covered/data/IndexMemStore.java#L84


Fortunately, this is fixed in Phoenix master: 
https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/hbase/index/covered/data/IndexMemStore.java#L82


The issue should be resolved in the next release of Phoenix.

The problem is that I wrongly assumed HBase's version numbers to be 
following semver and that a patch release would not introduce breaking 
changes.
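
As a toy illustration of why this surfaces at class-load time rather than at compile time 
(the class names below are made up and unrelated to the real HBase/Phoenix classes): if a 
library makes a method final in a patch release, any previously compiled subclass that 
overrides it fails bytecode verification when it is loaded:

    // Library class as it looked in "version A", which user code compiled against:
    class LibComparator {
        public int compare(String a, String b) { return a.compareTo(b); }
    }

    // User code compiled against version A:
    class IndexComparator extends LibComparator {
        @Override
        public int compare(String a, String b) { return b.compareTo(a); }
    }

    // If "version B" of the library changes the method to
    //     public final int compare(String a, String b) { ... }
    // then loading the already-compiled IndexComparator against version B fails
    // with java.lang.VerifyError ("overrides final method"), which matches the
    // "overrides final method compare" error reported in this thread.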


On 26/09/2018 1:04 AM, Jaanai Zhang wrote:



Is my method of installing HBase and Phoenix correct?

Did you check which versions of HBase exist in your classpath?

Is this a compatibility issue with Guava?

It isn't an exception caused by an incompatibility with Guava.


   Jaanai Zhang
   Best regards!



On Tue, Sep 25, 2018 at 8:25 PM, Francis Chuang wrote:


Thanks for taking a look, Jaanai!

Is my method of installing HBase and Phoenix correct? See

https://github.com/Boostport/hbase-phoenix-all-in-one/blob/master/Dockerfile#L12

Is this a compatibility issue with Guava?

Francis

On 25/09/2018 10:21 PM, Jaanai Zhang wrote:


org.apache.phoenix.hbase.index.covered.data.IndexMemStore$1
overrides
final method
compare.(Lorg/apache/hadoop/hbase/Cell;Lorg/apache/hadoop/hbase/Cell;)I
     at java.lang.ClassLoader.defineClass1(Native Method)
     at
java.lang.ClassLoader.defineClass(ClassLoader.java:763)
     at

It looks like the HBase jars are incompatible.


   Jaanai Zhang
   Best regards!



On Tue, Sep 25, 2018 at 8:06 PM, Francis Chuang <francischu...@apache.org> wrote:

Hi All,

I recently updated one of my Go apps to use Phoenix 5.0 with
HBase
2.0.2. I am using my Phoenix + HBase all in one docker image
available
here: https://github.com/Boostport/hbase-phoenix-all-in-one

This is the log/output from the exception:

RuntimeException: org.apache.phoenix.execute.CommitException:
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException:

Failed 1 action:
org.apache.phoenix.hbase.index.builder.IndexBuildingFailureException:

Failed to build index for unexpected reason!
     at

org.apache.phoenix.hbase.index.util.IndexManagementUtil.rethrowIndexingException(IndexManagementUtil.java:206)
     at
org.apache.phoenix.hbase.index.Indexer.preBatchMutate(Indexer.java:351)
     at

org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$28.call(RegionCoprocessorHost.java:1010)
     at

org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$28.call(RegionCoprocessorHost.java:1007)
     at

org.apache.hadoop.hbase.coprocessor.CoprocessorHost$ObserverOperationWithoutResult.callObserver(CoprocessorHost.java:540)
     at

org.apache.hadoop.hbase.coprocessor.CoprocessorHost.execOperation(CoprocessorHost.java:614)
     at

org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.preBatchMutate(RegionCoprocessorHost.java:1007)
     at

org.apache.hadoop.hbase.regionserver.HRegion$MutationBatchOperation.prepareMiniBatchOperations(HRegion.java:3487)
     at

org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate(HRegion.java:3896)
     at

org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3854)
     at

org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3785)
     at

org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:1027)
     at

org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicBatchOp(RSRpcServices.java:959)
  
