Re: Optimisation on join in case of all the data to be joined present in the same machine (region server)

2018-04-16 Thread Josh Elser
s-. As long as you're <100M rows per stream and have a few GB of disk space per processing node available, it should be doable. On Mon, 16 Apr 2018, 18:49 Rabin Banerjee, <mailto:dev.rabin.baner...@gmail.com>> wrote: Thanks Josh ! On Mon, Apr 16, 2018 at 11:16 PM, Josh E

PhoenixCon 2018 CFP extended until Friday

2018-04-17 Thread Josh Elser
We've received some requests to extend the CFP a few more days. The new closing date will be this Friday, 2018/04/20, end of day. Please keep them coming in! On 4/15/18 9:25 PM, Josh Elser wrote: The PhoenixCon 2018 call for proposals is scheduled to close Monday, April 16th. If you ha

Re: phoenix client

2018-04-19 Thread Josh Elser
This question is better asked on the Phoenix users list. The phoenix-client.jar is the one you need and is unique from the phoenix-core jar. Logging frameworks are likely not easily relocated/shaded to avoid issues which is why you're running into this. Can you provide the error you're seeing

Re: 答复: phoenix query server java.lang.ClassCastException for BIGINT ARRAY column

2018-04-25 Thread Josh Elser
As a general statement, the protobuf serialization is much better maintained and comes with a degree of backwards compatibility (which the JSON serialization guarantees none). Thanks for sharing the solution. On 4/20/18 9:53 AM, Lu Wei wrote: I did some digging, and the reason is because I sta

Re: Bind Params with Union throw AvaticaSqlException

2018-04-25 Thread Josh Elser
Thanks for that, Lew. I'm snowed under this week, but will try to dig into this some more given the information you provided. On 4/20/18 4:26 PM, Lew Jackman wrote: We have a bit more of a stack trace for our bind parameter exception, not sure if this is very revealing but we have: java.lang.Ru

Re: Decrease HTTP chattiness?

2018-05-23 Thread Josh Elser
Yeah, as Francis says, this should already be exposed via the expected JDBC APIs. Kevin -- can you share more details about what version(s) you're running? A sample program? If you're running a new enough version, you can set the following log level via Log4j org.apache.calcite.avatica.rem

Re: Phoenix ODBC driver limitations

2018-05-23 Thread Josh Elser
I'd be surprised to hear that the ODBC driver would need to know anything about namespace-mapping. Do you have an error? Steps to reproduce an issue which you see? The reason I am surprised is that namespace mapping is an implementation detail of the JDBC driver which lives inside of PQS -- *n

Re: Cannot access from jdbc

2018-05-23 Thread Josh Elser
Try enabling DEBUG logging for HBase and take a look at the RegionServer log identified by the hostname in the log message. Most of the time when you see this error, it's a result of HBase rejecting the incoming request for a Kerberos authentication issue. On 5/23/18 12:10 PM, Nicolas Paris w

Re: Phoenix ODBC driver limitations

2018-05-25 Thread Josh Elser
rstood that the "client" in this case is queryserver but not ODBC driver. And now I need to check why queryserver doesn't apply this property. -Original Message- From: Josh Elser [mailto:els...@apache.org] Sent: Wednesday, May 23, 2018 6:52 PM To: user@phoenix.apache.org Subj

Re: SQL with join are running slow

2018-06-07 Thread Josh Elser
That sounds like the implementation of a HashJoin. You would want to make sure your smaller relation is serialized for this HashJoin, not the larger one. Phoenix also supports a sort-merge join which may perform better when you read a large percentage of data for both relations. O

Re: Problem starting region server with Hbase version hbase-2.0.0

2018-06-08 Thread Josh Elser
You shouldn't be putting the phoenix-client.jar on the HBase server classpath. There is the phoenix-server.jar, which is specifically built to be included in HBase (to avoid issues such as these). Please remove all phoenix-client jars and provide the phoenix-5.0.0-server jar inst

Re: Apache Phoenix 4.14 (all 4.9+ also) not supporting hbase.1.1.2

2018-06-15 Thread Josh Elser
Please reach out to Hortonworks for more information about supported versions of Phoenix with HDP. On 6/15/18 6:51 AM, rahuledavalath1 wrote: Hi All We are using hortonworks latest* hdp stack 2.6.5*. There the hbase version is* hbase.1.1.2*. We downloaded *apache-phoenix-4.14.0-HBase-1.1* and

Re: [DISCUSS] Docker images for Phoenix

2018-06-25 Thread Josh Elser
Moving this over to the dev list since this is a thing for developers to make the call on. Would ask users who have interest to comment over there as well :) I think having a "one-button" Phoenix environment is a big win, especially for folks who want to do one-off testing with a specific vers

Re: Upsert is EXTREMELY slow

2018-07-11 Thread Josh Elser
The explain plan for your tables isn't a substitute for the DDLs. Please provide those. How about sharing your complete hbase-site.xml and hbase-env.sh files, rather than just snippets like you have. A full picture is often needed. Given that HBase cannot directly run on S3, please also des

Re: Upsert is EXTREMELY slow

2018-07-11 Thread Josh Elser
Some thoughts: * Please _remove_ commented lines before sharing configuration next time. We don't need to see all of the things you don't have set :) * 100 salt buckets is really excessive for a 4 node cluster. Salt buckets are not synonymous with pre-splitting HBase tables. This many salt b

Re: Upsert is EXTREMELY slow

2018-07-11 Thread Josh Elser
Your real-world situation is not a single-threaded application, is it? You will have multiple threads which are all updating Phoenix concurrently. Given the semantics that your application needs from the requirements you stated, I'm not sure what else you can do differently. You can get low-la

Re: Upsert is EXTREMELY slow

2018-07-12 Thread Josh Elser
Phoenix does not recommend connection pooling because Phoenix Connections are not expensive to create, unlike most DB connections. The first connection you make from a JVM is expensive. Every subsequent one is cheap. On 7/11/18 2:55 PM, alchemist wrote: Since Phoenix does not recommend connec

Re: Upsert is EXTREMELY slow

2018-07-12 Thread Josh Elser
cs of batching which you are completely missing out on. There are multiple manifestations of this. Row-locks are just one (network overhead, serialization, and rpc scheduling/execution are three others I can easily see) On 7/11/18 4:10 PM, alchemist wrote: Josh Elser-2 wrote Josh thanks so much fo
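The overheads named here (row locks, network round trips, serialization, RPC scheduling) can be put in rough numbers. A minimal sketch, where the one-round-trip-per-commit model and the batch sizes are illustrative assumptions, not measured Phoenix behavior:

```python
import math

def rpc_count(total_rows: int, batch_size: int) -> int:
    """Approximate client->server round trips when committing every
    `batch_size` rows (batch_size=1 means row-at-a-time commits)."""
    return math.ceil(total_rows / batch_size)

# Row-at-a-time: one round trip (plus row-lock acquisition,
# serialization, and RPC scheduling overhead) per row.
single = rpc_count(100_000, 1)
# Committing every 1000 rows amortizes that overhead ~1000x.
batched = rpc_count(100_000, 1000)
```

The constant factors differ per cluster, but the ratio is what the batching advice is about.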

Re: Upsert is EXTREMELY slow

2018-07-13 Thread Josh Elser
Also, they're relying on Phoenix to do secondary index updates for them. Obviously, you can do this faster than Phoenix can if you know the exact use-case. On 7/12/18 6:31 PM, Pedro Boado wrote: A tip for performance is reusing the same preparedStatement , just clearParameters() , set values

Re: Upsert is EXTREMELY slow

2018-07-13 Thread Josh Elser
’t be slower to update secondary indexes than a use case would be. Both have to do the writes to a second table to keep it in sync. On Fri, Jul 13, 2018 at 8:39 AM Josh Elser <mailto:els...@apache.org>> wrote: Also, they're relying on Phoenix to do secondary index updates for t

Re: Phoenix 5 connection URL issue

2018-07-23 Thread Josh Elser
Use the absolute path to your keytab, not the tilde character to refer to your current user's home directory. On 7/23/18 1:11 AM, Sumanta Gh wrote: Hi, I am trying to connect a Kerberos enabled Hbase 2.0 cluster from Phoenix 5.0 client (sqlline). This is my connection URL - jdbc:phoenix:Zk
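The thick-driver URL shape this advice refers to can be sketched as below. The quorum, principal, and keytab path are placeholders, and the exact URL format should be treated as an assumption to verify against your Phoenix version's documentation:

```python
def phoenix_url(quorum: str, port: int, znode: str,
                principal: str, keytab: str) -> str:
    """Build a thick-driver JDBC URL with Kerberos principal + keytab.
    The keytab must be an absolute path: '~' is a shell expansion and
    is never resolved by the JDBC driver itself."""
    if keytab.startswith("~"):
        raise ValueError("use an absolute keytab path, not '~'")
    return f"jdbc:phoenix:{quorum}:{port}:{znode}:{principal}:{keytab}"

url = phoenix_url("zk1,zk2,zk3", 2181, "/hbase",
                  "user@EXAMPLE.COM",
                  "/etc/security/keytabs/user.keytab")
```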

Re: jdbc not work, phoenix 4.14.0-cdh5.11.2 + kerberos

2018-07-31 Thread Josh Elser
Did you enable DEBUG logging on the client or server side? Certainly if you got a connection timeout, you at least got a stack trace that you could share. You need to provide more information if you want help debugging your setup. On 7/31/18 6:29 AM, anung wrote: Hi All, I have CDH 5.11 clus

Re: Potentially corrupted table

2018-07-31 Thread Josh Elser
I don't recall any big issues on 4.13.2, but I, admittedly, haven't followed it closely. You weren't doing anything weird on your own -- you wrote data via the JDBC driver? Any index tables? Aside from weirdness in the client with statistics, there isn't much I've seen that ever causes a "ba

Re: jdbc not work, phoenix 4.14.0-cdh5.11.2 + kerberos

2018-08-01 Thread Josh Elser
; on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=Regionserver,60020,1533093258500, seqNum=0 ... Thank you, BR, Anung On Wed, Aug 1, 2018 at 1:50 AM Josh Elser wrote: Did you enable DEBUG logging on the client or server side? Certainly if you got a connection timeout

Re: Spark-Phoenix Plugin

2018-08-06 Thread Josh Elser
Besides the distribution and parallelism of Spark as a distributed execution framework, I can't really see how phoenix-spark would be faster than the JDBC driver :). Phoenix-spark and the JDBC driver are using the same code under the hood. Phoenix-spark is using the PhoenixOutputFormat (and th

Re: error when using apache-phoenix-4.14.0-HBase-1.2-bin with hbase 1.2.6

2018-08-07 Thread Josh Elser
"Phoenix-server" refers to the phoenix-$VERSION-server.jar that is either included in the binary tarball or is generated by the official source-release. "Deploying" it means copying the jar to $HBASE_HOME/lib. On 8/6/18 9:56 PM, 倪项菲 wrote: Hi Zhang Yun,     the link you mentioned tells us t

Re: Statements caching

2018-08-16 Thread Josh Elser
You don't have to create a new Connection every time, but it is not directly harmful to do so. This recommendation only goes one way (just because you can create new connections each time, doesn't imply that you have to, nor necessarily want to). I wouldn't be worried about any sort of "health

Re: Searching a string in Phoenix view in all columns

2018-08-16 Thread Josh Elser
SQL doesn't work like this. You can use the DatabaseMetaData class, obtained off of the JDBC Connection class, to inspect the available columns for a query. However, I'd caution you against constructing a massive disjunction, e.g. select * from demotable where colA like ".." or colB like ".."
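As a sketch of the disjunction being cautioned against, assuming a column list like one you would obtain from DatabaseMetaData.getColumns() (the table and column names here are hypothetical):

```python
def like_disjunction(table: str, columns: list[str]) -> str:
    """Build `SELECT * FROM t WHERE colA LIKE ? OR colB LIKE ? ...`.
    A disjunct over every column defeats any index on a single
    column, which is why a wide OR like this tends to force a
    full-table scan."""
    predicate = " OR ".join(f"{c} LIKE ?" for c in columns)
    return f"SELECT * FROM {table} WHERE {predicate}"
```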

Re: Phoenix CsvBulkLoadTool fails with java.sql.SQLException: ERROR 103 (08004): Unable to establish connection

2018-08-20 Thread Josh Elser
(-cc user@hbase, +bcc user@hbase) How about the rest of the stacktrace? You didn't share the cause. On 8/20/18 1:35 PM, Mich Talebzadeh wrote: This was working fine before my Hbase upgrade to 1.2.6 I have Hbase version 1.2.6 and Phoenix version apache-phoenix-4.8.1-HBase-1.2-bin This comma

Re: Phoenix CsvBulkLoadTool fails with java.sql.SQLException: ERROR 103 (08004): Unable to establish connection

2018-08-21 Thread Josh Elser
he author will in no case be liable for any monetary damages arising from such loss, damage or destruction. On Mon, 20 Aug 2018 at 21:24, Josh Elser mailto:els...@apache.org>> wrote: (-cc user@hbase, +bcc user@hbase) How

Re: CursorUtil -> mapCursorIDQuery

2018-08-21 Thread Josh Elser
Hi Jack, Given your assessment, it sounds like you've stumbled onto a race condition! Thanks for bringing it to our attention. A few questions: * Have you checked if the same code exist in the latest branches/releases (4.x-HBase-1.{2,3,4} or master)? * Want to create a Jira issue to track t

Re: CursorUtil -> mapCursorIDQuery

2018-08-21 Thread Josh Elser
ss it seems has not changed since it was first created as part of PHOENIX-3572 - and is still the same in master (I checked a bit earlier). Sure - will have a go at creating a JIRA for this. Regards, On Tue, 21 Aug 2018 at 16:23, Josh Elser wrote: Hi Jack, Given your assessment, it sounds

Re: Is read-only user of Phoenix table possible?

2018-08-27 Thread Josh Elser
Note, that the functionality that Thomas describes is how we intend Phoenix to work, and may not be how the 4.9 release of Phoenix works (due to changes that have been made). On 8/23/18 12:42 PM, Thomas D'Silva wrote: On a new cluster, the first time a client connects is when the SYSTEM tables

Re: Phoenix Query Server Impersonation With Non-Kerberos HBase

2018-08-28 Thread Josh Elser
In released versions of Apache Phoenix, there are only two authentication models for PQS: None and Kerberos via SPNEGO. Karan and Alex have been doing some good work (in what is presently slated to be 4.15.0 and 5.1.0) around allowing pluggable authentication and authorization as a part of

Re: Read-Write data to/from Phoenix 4.13 or 4.14 with Spark SQL Dataframe 2.1.0

2018-09-10 Thread Josh Elser
Lots of details missing here about how you're trying to submit these Spark jobs, but let me try to explain how things work now: Phoenix provides spark(1) and spark2 jars. These JARs provide the implementation for Spark *on top* of what the phoenix-client.jar. You want to include both the phoen

Re: ABORTING region server and following HBase cluster "crash"

2018-09-10 Thread Josh Elser
Did you update the HBase jars on all RegionServers? Make sure that you have all of the Regions assigned (no RITs). There could be a pretty simple explanation as to why the index can't be written to. On 9/9/18 3:46 PM, Batyrshin Alexander wrote: Correct me if im wrong. But looks like if you

Re: Issues with LAST_VALUE aggregation UDF

2018-09-11 Thread Josh Elser
The versions you provided in the description make it sound like you're actually using HDP's distribution of Apache Phoenix, not an official Apache Phoenix release. Please test against an Apache Phoenix release or contact Hortonworks for support. It would not be unheard of that this issue has a

Re: Missing content in phoenix after writing from Spark

2018-09-12 Thread Josh Elser
Reminder: Using Phoenix internals forces you to understand exactly how the version of Phoenix that you're using serializes data. Is there a reason you're not using SQL to interact with Phoenix? Sounds to me that Phoenix is expecting more data at the head of your rowkey. Maybe a salt bucket tha

Re: Missing content in phoenix after writing from Spark

2018-09-13 Thread Josh Elser
correctly on Hbase). We'll be looking into this but if you have any further advice, appreciated. Saif On Wed, Sep 12, 2018 at 5:50 PM Josh Elser mailto:els...@apache.org>> wrote: Reminder: Using Phoenix internals

Re: Salting based on partial rowkeys

2018-09-14 Thread Josh Elser
Yeah, I think that's his point :) For a fine-grained facet, the hotspotting is desirable to co-locate the data for query. To try to make an example to drive this point home: Consider a primary key constraint(col1, col2, col3, col4); If I defined the SALT_HASH based on "col1" alone, you'd get
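The col1-only salting idea can be sketched generically. This uses an illustrative hash over a prefix of the key columns, not Phoenix's actual salt-byte computation:

```python
import hashlib

def salt_bucket(key_cols: tuple, num_buckets: int, prefix_len: int) -> int:
    """Compute a salt bucket from only the first `prefix_len`
    primary-key columns. (Illustrative hash; Phoenix's real salt
    byte is computed differently.)"""
    prefix = "\x00".join(str(c) for c in key_cols[:prefix_len])
    digest = hashlib.sha256(prefix.encode()).digest()
    return digest[0] % num_buckets

# Salting on col1 alone: every row sharing col1 lands in one bucket,
# co-locating those rows for scans -- at the cost of hotspotting
# writes for that col1 value.
a = salt_bucket(("col1val", "x", "y", "z"), 8, 1)
b = salt_bucket(("col1val", "p", "q", "r"), 8, 1)
assert a == b
```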

Re: Missing content in phoenix after writing from Spark

2018-09-17 Thread Josh Elser
/maven2/org/apache/twill/twill-discovery-core/0.13.0/twill-discovery-core-0.13.0.jar Not sure which one I could be missing On Fri, Sep 14, 2018 at 7:34 PM Josh Elser <mailto:els...@apache.org>> wrote: Uh, you're definitely not using the right JARs :) You'll want the

Re: Slow query on Secondary Index

2018-09-18 Thread Josh Elser
Your question is suitable for the user@phoenix mailing list. Please do not cross post questions to multiple lists. On 9/18/18 10:57 AM, Vishwajeet Rana wrote: Hi, I have two salted global secondary indexes (A and B) on a table with row key (primary key) as the covered column. For both of them

Re: Read-Write data to/from Phoenix 4.13 or 4.14 with Spark SQL Dataframe 2.1.0

2018-09-18 Thread Josh Elser
. Regards, Liubov Data Engineer IR.ee On Tue, Sep 11, 2018 at 4:06 AM Josh Elser mailto:els...@apache.org>> wrote: Lots of details missing here about how you're trying to submit these Spark jobs, but let me try to explain how things work now:

Re: IllegalStateException: Phoenix driver closed because server is shutting down

2018-09-19 Thread Josh Elser
What version of Phoenix are you using? Is this the full stack trace you see that touches Phoenix (or HBase) classes? On 9/19/18 12:42 PM, Batyrshin Alexander wrote: Is there any reason for this exception? Which exactly server is shutting down if we use quorum of zookepers? java.lang.IllegalSt

Re: Phoenix 5.0 could not commit transaction: org.apache.phoenix.execute.CommitException: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: org.apache.phoenix.hbase

2018-09-25 Thread Josh Elser
Your assumptions are not unreasonable :) Phoenix 5.0.x should certainly work with HBase 2.0.x. Glad to see that it's been corrected already (embarrassing that I don't even remember reviewing this). Let me start a thread on dev@phoenix about a 5.0.1 or a 5.1.0. We need to have a Phoenix 5.x that

Re: org.apache.phoenix.shaded.org.apache.thrift.TException: Unable to discover transaction service. -> TException: Unable to discover transaction service.

2018-09-26 Thread Josh Elser
If you're using HBase with Hadoop3, HBase should have Hadoop3 jars. Re-build HBase using the -Dhadoop.profile=3.0 (I think it is) CLI option. On 9/26/18 7:21 AM, Francis Chuang wrote: Upon further investigation, it appears that this is because org.apache.hadoop.security.authentication.util.Kerb

Re: Table dead lock: ERROR 1120 (XCL20): Writes to table blocked until index can be updated

2018-10-02 Thread Josh Elser
HBase will invalidate the location of a Region on seeing certain exceptions (including NotServingRegionException). After it sees the exception you have copied below, it should re-fetch the location of the Region. If HBase keeps trying to access a Region on a RS that isn't hosting it, either h

Re: Specifying HBase cell visibility labels or running as a particular user

2018-10-08 Thread Josh Elser
Hey Mike, You can definitely authenticate yourself as with the Kerberos credentials of your choice. There are generally two ways in you can do this: 1. Login using UserGroupInformation APIs and then make JDBC calls with the Phoenix JDBC driver (thick or thin) 2. Use the principal+keytab JDBC

Re: ON DUPLICATE KEY with Global Index

2018-10-09 Thread Josh Elser
Can you elaborate on what is unclear about the documentation? This exception and the related documentation read as being in support of each other to me. On 10/9/18 5:39 AM, Batyrshin Alexander wrote: Hello all, Documentations (http://phoenix.apache.org/atomic_upsert.html) say: "Although glo

Re: Phoenix metrics error on thin client

2018-10-17 Thread Josh Elser
The methods that you are invoking assume that the Phoenix JDBC driver (the java class org.apache.phoenix.jdbc.PhoenixDriver) is in use. It's not, so you get this error. The Phoenix "thick" JDBC driver is what's running inside of the Phoenix Query Server, just not in your local JVM. As such, yo

Re: Connection Pooling?

2018-10-18 Thread Josh Elser
Batyrshin, you asked about statement caching which is different than connection pooling. @JMS, yes, the FAQ is accurate (as is the majority of the rest of the documentation ;)) On 10/18/18 1:14 PM, Batyrshin Alexander wrote: I've already asked the same question in this thread - http://apache

Re: UPSERT SELECT Isolation Level

2018-10-18 Thread Josh Elser
Hi Owen, I'm not aware of any atomicity on an UPSERT-SELECT. I believe the semantics will be much more like what HBase provides than a higher-level RDBMS isolation level. That is, if you update a cell's value after you start an upsert-select, it is non-deterministic if your upsert-select wil

Re: Phoenix metrics error on thin client

2018-10-23 Thread Josh Elser
y, please let me know On Thu, Oct 18, 2018 at 7:00 PM Monil Gandhi mailto:mgand...@gmail.com>> wrote: Okay. Will take a look. Thanks On Wed, Oct 17, 2018 at 8:28 AM Josh Elser mailto:els...@apache.org>> wrote: The methods that you are invoking as

Re: Phoenix Performances & Uses Cases

2018-10-29 Thread Josh Elser
Specifically to your last two points about windowing, transforming, grouping, etc: my current opinion is that Hive does certain analytical style operations much better than Phoenix. Personally, I don't think it makes sense for Phoenix to try to "catch up". It would take years for us to build su

Re: Phoenix Performances & Uses Cases

2018-10-29 Thread Josh Elser
On 10/29/18 11:39 AM, Nicolas Paris wrote: Thanks Josh, On Mon, Oct 29, 2018 at 10:47:42AM -0400, Josh Elser wrote: Use Hive when Hive does things well, and use Phoenix when Phoenix does it well. That would be great. My concern is the phoenix "joins" do not compete with postgr

Re: Python phoenixdb adapter and JSON serialization on PQS

2018-11-02 Thread Josh Elser
I would strongly suggest you do not use the JSON serialization. The JSON support is implemented via Jackson which has no means to make backwards compatibility "easy". By contrast, protobuf makes this extremely easy and we have multiple examples over the past years where we've been able to

Re: ABORTING region server and following HBase cluster "crash"

2018-11-02 Thread Josh Elser
<mailto:0x62...@gmail.com>> wrote: > > After update web interface at Master show that every region server now 1.4.7 and no RITS. > > Cluster recovered only when we restart all regions servers 4 times... > >> On 11 Sep 2018, at 04:08, Josh Else

Re: ABORTING region server and following HBase cluster "crash"

2018-11-05 Thread Josh Elser
al indexes, not specific to phoenix. I apologize if my response came off as dismissing phoenix altogether. FWIW, I'm a big advocate of phoenix at my org internally, albeit for the newer version. On Fri, Nov 2, 2018, 4:09 PM Josh Elser <mailto:els...@apache.org>> wrote: I wou

Re: Python phoenixdb adapter and JSON serialization on PQS

2018-11-05 Thread Josh Elser
problems when using the python adapter and not when using sqlline-thin, do they follow different code paths (especially around serialization)? Thanks again, Manoj On Fri, Nov 2, 2018 at 4:05 PM Josh Elser <mailto:els...@apache.org>> wrote: I would strongly suggest you do not use the J

Re: Python phoenixdb adapter and JSON serialization on PQS

2018-11-06 Thread Josh Elser
only HTTP) to a remote server to run the JDBC operations. So, yes: an Avatica client is always using HTTP, given whatever serialization you instruct it to use. I'll work on getting some test cases here soon to illustrate this as well as the performance problem. Thanks again! Man

Re: Python phoenixdb adapter and JSON serialization on PQS

2018-11-09 Thread Josh Elser
id not work with fetching all rows at once, i.e., cursor.itersize=-2) Because of these issues, for now we have switched PQS to using JSON serialization and updated our clients to use the same. We’re obviously very much interested in understanding how the protobuf path can be made faster. Tha

Re: Heap Size Recommendation

2018-11-27 Thread Josh Elser
HBASE_HEAPSIZE is just an environment variable which sets the JVM heap size. Your question doesn't make any sense to me. On 11/16/18 8:44 AM, Azharuddin Shaikh wrote: Hi All, We want to improve the read performance of phoenix query for which we are trying to upgrade the HBASE_HEAPSIZE. Curr

Re: JDBC Connection URL to Phoenix on Azure HDInsight

2018-11-27 Thread Josh Elser
Are you trying to use the thick driver (direct HBase connection) or the thin driver (via Phoenix Query Server)? You're providing examples of both. If the thick driver: by default, I believe that HDI configures a root znode of "/hbase" not "/hbase-unsecure" which would be a problem. You need to

Re: client does not have phoenix.schema.isNamespaceMappingEnabled

2018-11-27 Thread Josh Elser
To add a non-jar file to the classpath of a Java application, you must add the directory containing that file to the classpath. Thus, the following is wrong: HADOOP_CLASSPATH=/usr/hdp/3.0.1.0-187/hbase/lib/hbase-protocol.jar:/etc/hbase/3.0.1.0-187/0/hbase-site.xml And should be: HADOOP_CLASS
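The fix can be sketched as a small helper (the paths here are illustrative, not the exact HDP layout): any classpath entry that is not a jar should be replaced by the directory containing it, since the Java classpath only takes jars and directories:

```python
import os

def fix_classpath(entries: list[str]) -> str:
    """Java classpaths accept jars and *directories*; a bare .xml
    entry is silently useless, so substitute its parent directory."""
    fixed = [e if e.endswith(".jar") else os.path.dirname(e)
             for e in entries]
    return ":".join(fixed)

cp = fix_classpath([
    "/usr/hdp/current/hbase/lib/hbase-protocol.jar",
    "/etc/hbase/conf/hbase-site.xml",   # wrong as-is: not a jar
])
```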

Re: client does not have phoenix.schema.isNamespaceMappingEnabled

2018-11-29 Thread Josh Elser
.jar:/usr/hdp/3.0.1.0-187/hbase/lib/hbase-zookeeper-2.0.0.3.0.1.0-187.jar:/usr/hdp/3.0.1.0-187/hbase/lib/jackson-annotations-2.9.5.jar:/usr/hdp/3.0.1.0-187/hbase/lib/jackson-core-2.9.5.jar On Tue, Nov 27, 2018 at 4:26 PM Josh Elser <mailto:els...@apache.org>> wrote: To add a

Re: slf4j class files in phoenix-5.0.0-HBase-2.0-client.jar

2018-12-22 Thread Josh Elser
This is as expected. JDBC expects that a database driver provide all of its dependencies in a single jar file. On Mon, Dec 17, 2018 at 4:39 PM Liang Zhao wrote: > Hi, > > > > We found slf4j class files in your phoenix-5.0.0-HBase-2.0-client.jar, > which caused multiple binding of slf4j as the Sp

Re: slf4j class files in phoenix-5.0.0-HBase-2.0-client.jar

2018-12-28 Thread Josh Elser
lient.html > > Sent from my iPhone > > On Dec 22, 2018, at 9:25 AM, Josh Elser wrote: > > This is as expected. JDBC expects that a database driver provide all of its > dependencies in a single jar file. > > On Mon, Dec 17, 2018 at 4:39 PM Liang Zhao wrote: >>

Re: Hbase vs Phienix column names

2019-01-08 Thread Josh Elser
(from the peanut-gallery) That sounds to me like a useful utility to share with others if you're going to write it anyways, Anil :) On 1/8/19 12:54 AM, Thomas D'Silva wrote: There isn't an existing utility that does that. You would have to look up the COLUMN_QUALIFIER for the columns you are

Re: Is it possible to do a dynamic deploy of a newer version of Phoenix coprocessor to specific tables?

2019-01-21 Thread Josh Elser
Owen, There would be significant "unwanted side-effects". You would be taking on a very large burden trying to come up with a corresponding client version of Phoenix which would still work against the newer coprocessors that you are trying to deploy. Phoenix doesn't provide any guarantee of compat

Re: Is it possible to do a dynamic deploy of a newer version of Phoenix coprocessor to specific tables?

2019-01-22 Thread Josh Elser
ient would only be accessing tables with the same Phoenix version. But it maybe that my take has a lot of erroneous assumptions in it, as I haven't looked at the internals of the JDBC driver code. On Mon, 21 Jan 2019 at 18:09, Josh Elser <mailto:els...@apache.org>> wrote: O

Re: split count for mapreduce jobs with PhoenixInputFormat

2019-01-30 Thread Josh Elser
You can extend/customize the PhoenixInputFormat with your own code to increase the number of InputSplits and Mappers. On 1/30/19 6:43 AM, Edwin Litterst wrote: Hi, I am using PhoenixInputFormat as input source for mapreduce jobs. The split count (which determines how many mappers are used for t
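The idea behind a custom InputFormat that produces more splits can be sketched with plain integer ranges. Real Phoenix InputSplits are key ranges derived from regions and statistics guideposts; this only shows the subdivision logic:

```python
def subdivide(start: int, end: int, n: int) -> list[tuple[int, int]]:
    """Split [start, end) into n contiguous sub-ranges -- the core of
    handing the framework more InputSplits (and thus more mappers)."""
    width = (end - start) / n
    bounds = [start + round(i * width) for i in range(n)] + [end]
    return list(zip(bounds, bounds[1:]))

splits = subdivide(0, 100, 4)   # four equal sub-ranges
```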

Re: split count for mapreduce jobs with PhoenixInputFormat

2019-01-30 Thread Josh Elser
an 30, 2019 at 7:31 AM Josh Elser mailto:els...@apache.org>> wrote: You can extend/customize the PhoenixInputFormat with your own code to increase the number of InputSplits and Mappers. On 1/30/19 6:43 AM, Edwin Litterst wrote: > Hi,

2 weeks remaining for NoSQL Day abstract submission

2019-04-04 Thread Josh Elser
There are just *two weeks* remaining to submit abstracts for NoSQL Day 2019, in Washington D.C. on May 21st. Abstracts are due April 19th. https://dataworkssummit.com/nosql-day-2019/ Abstracts don't need to be more than a paragraph or two. Please take the time sooner rather than later to submit your abstr

Re: Fwd: DELIVERY FAILURE: Error transferring to QCMBSJ601.HERMES.SI.SOCGEN; Maximum hop count exceeded. Message probably in a routing loop.

2019-04-05 Thread Josh Elser
Not just you. I have an email filter set up for folks like this :) A moderator should be able to force a removal. Trying it now to see if I'm a moderator (I don't think I am, but might be able to add myself as one). On 4/4/19 7:15 PM, William Shen wrote: I kept getting this every time I send

Re: Fwd: DELIVERY FAILURE: Error transferring to QCMBSJ601.HERMES.SI.SOCGEN; Maximum hop count exceeded. Message probably in a routing loop.

2019-04-05 Thread Josh Elser
n 4/5/19 3:47 PM, Josh Elser wrote: Not just you. I have an email filter set up for folks like this :) A moderator should be able to force a removal. Trying it now to see if I'm a moderator (I don't think I am, but might be able to add myself as one). On 4/4/19 7:15 PM, Willia

Re: Large differences in query execution time for similar queries

2019-04-22 Thread Josh Elser
Further, I'd try to implement James' suggestions _not_ using the Phoenix Query Server. Remember that the thin-client uses PQS, adding a level of indirection and re-serialization. By using the "thick" driver, you can avoid this overhead which will help you get repeatable test results with less

Re: Phoenix Mapreduce

2019-04-30 Thread Josh Elser
No, you will not "lose" data. You will just have mappers that read from more than one Region (and thus, more than one RegionServer). The hope in this approach is that we can launch Mappers on the same node as the RegionServer hosting your Region and avoid reading any data over the network.

NoSQL Day on May 21st in Washington D.C.

2019-05-09 Thread Josh Elser
For those of you in/around the Washington D.C. area, NoSQL day is fast approaching. If you've not already signed up, please check out the agenda and consider joining us for a fun and technical day with lots of talks from Apache committers and big names in industry: https://dataworkssummit.co

Re: PQS + Kerberos problems

2019-05-28 Thread Josh Elser
Make sure you have authorization set up correctly between PQS and HBase. Specifically, you must have the appropriate Hadoop proxyuser rules set up in core-site.xml so that HBase will allow PQS to impersonate the PQS end-user. On 5/14/19 11:04 AM, Aleksandr Saraseka wrote: Hello, I have HBase

Re: Fwd: DELIVERY FAILURE: Error transferring to QCMBSJ601.HERMES.SI.SOCGEN; Maximum hop count exceeded. Message probably in a routing loop.

2019-05-28 Thread Josh Elser
at 3:38 PM William Shen <mailto:wills...@marinsoftware.com>> wrote: Thanks Josh for looking into this! On Fri, Apr 5, 2019 at 12:52 PM Josh Elser mailto:els...@apache.org>> wrote: I can't do this right now, but I've asked Infra to give me the

Re: Problem with ROW_TIMESTAMP

2019-06-10 Thread Josh Elser
When you want to use Phoenix to query your data, you're going to have a much better time if you also use Phoenix to load the data. Unless you specifically know what you're doing (and how to properly serialize the data into HBase so that Phoenix can read it), you should use Phoenix to both read

Re: Phoenix 4 to 5 Upgrade Path

2019-06-12 Thread Josh Elser
What version of Phoenix 4 are you coming from? Of note, if you're lagging far behind, you'll get bit by the column encoding turning on by default in 4.10 [1] In general, before we update the system catalog table, we take a snapshot of it, so you can roll back (although this would be manual).

Re: Phoenix 4 to 5 Upgrade Path

2019-06-14 Thread Josh Elser
nds to provide data compatibility at least two versions back? Does this intention apply across major version boundaries? More specifically, does it imply that data produced by 4.14.1 is intended to be compatible with 4.14.2 and 5.0.0? Thank you. vova On Wed, Jun 12, 2019 at 1:18 PM Josh El

Re: A strange question about Phoenix

2019-06-20 Thread Josh Elser
Make sure you have updated statistics for your table. Depending on the last time you created the stats, you may have to manually delete the stats from SYSTEM.STATS (as there are safeguards to prevent re-creating statistics too frequently). There have been some bugs in the past that results fro

Re: Curl kerberized QueryServer using protobuf type

2019-07-02 Thread Josh Elser
Hey Reid, Protobuf is a binary format -- this is error'ing out because you're sending it plain-text. You're going to have quite a hard time constructing messages in bash alone. There are lots of language bindings[1]. You should be able to pick any of these to help encode/decode messages (if

Re: Questions about ZK Load Balancer

2019-07-09 Thread Josh Elser
Yeah, that's correct. I think I requested some documentation to be added by the original author to clarify that it's not end-to-end usable, but I don't think it ever happened. The "load balancer" isn't anything more than service advertisement, IIRC. IMO, the write-up I made here[1] is going

Re: Spark sql query the hive external table mapped from phoenix always throw out Class org.apache.phoenix.hive.PhoenixSerDe not found exception

2019-07-11 Thread Josh Elser
(Moving this over to the user list as that's the appropriate list for this question) Do you get an error? We can't help you with only an "it didn't work" :) I'd suggest that you try to narrow down the scope of the problem: is it unique to Hive external tables? Can you use a different Hive Stor
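For reference, a Hive external table mapped onto a Phoenix table typically looks roughly like the following. The storage-handler class and property names are from the phoenix-hive integration docs; the table names and ZooKeeper quorum are placeholders.

```sql
CREATE EXTERNAL TABLE phoenix_events (
    id   BIGINT,
    name STRING
)
STORED BY 'org.apache.phoenix.hive.PhoenixStorageHandler'
TBLPROPERTIES (
    "phoenix.table.name"             = "EVENTS",
    "phoenix.zookeeper.quorum"       = "zk1,zk2,zk3",
    "phoenix.zookeeper.znode.parent" = "/hbase",
    "phoenix.rowkeys"                = "id"
);
```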

Re: Alter Table throws java.lang.NullPointerException

2019-07-24 Thread Josh Elser
Please start by sharing the version of Phoenix that you're using. Did you search Jira to see if there was someone else who also reported this issue? On 7/23/19 4:24 PM, Alexander Batyrshin wrote: Hello all, Got this: alter table TEST_TABLE SET APPEND_ONLY_SCHEMA=true; java.lang.NullPointerE

Re: Secondary Indexes - Missing Data in Phoenix

2019-07-25 Thread Josh Elser
Local indexes are stored in the same table as the data. They are "local" to the data. I would not be surprised if you are running into issues because you are using such an old version of Phoenix. On 7/24/19 10:35 PM, Alexander Lytchier wrote: Hi, We are currently using Cloudera as a package
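For readers unfamiliar with the distinction: a local index is declared with the LOCAL keyword, and its rows are stored in the same regions as the data they index (table and column names are hypothetical):

```sql
-- Index rows co-reside with the data table's regions.
CREATE LOCAL INDEX IDX_EVENTS_NAME ON EVENTS (NAME);
```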

Re: Phoenix Upgrade 4.7 to 4.14 - Cannot use Phoenix

2019-08-06 Thread Josh Elser
Looks like this is a bug in the system table upgrade code path which doesn't handle the jump from 4.7 to 4.14 correctly. This big of a version jump is not tested/supported in Apache. Does CLABS give you a guarantee that this will work? It would likely be good to contact Cloudera support if you

Re: Phoenix client threads

2019-08-06 Thread Josh Elser
Please take a look at the documentation: https://phoenix.apache.org/tuning.html On 7/29/19 4:24 AM, Sumanta Gh wrote: Hi, When we use Phoenix client, there are by default 10 new PHOENIX-SCANNER-RENEW-LEASE-threads created. There are also new threads spawned for hconnection-shared-thread-pool.
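Two of the client-side knobs from that tuning page which govern the client thread pool (the values here are illustrative, not recommendations):

```xml
<!-- hbase-site.xml on the Phoenix client -->
<property>
  <name>phoenix.query.threadPoolSize</name>
  <value>128</value>
</property>
<property>
  <name>phoenix.query.queueSize</name>
  <value>5000</value>
</property>
```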

Re: Phoenix with multiple HBase masters

2019-08-07 Thread Josh Elser
Great answer, Aleksandr! Also worth mentioning there is only ever one active HBase Master at a time. If you have multiple started, one will be active as the master and the rest will be waiting as a standby in case the current active master dies for some reason (expectedly or unexpectedly). O

Re: Phoenix with multiple HBase masters

2019-08-08 Thread Josh Elser
nt gets list of PQS addresses from ZK, and performs load balancing. Thus ELB won't be needed. On Wed, Aug 7, 2019, 9:01 AM Josh Elser mailto:els...@apache.org>> wrote: Great answer, Aleksandr! Also worth mentioning there is only ever one active HBase

Re: java.io.IOException: Added a key not lexically larger than previous

2019-08-15 Thread Josh Elser
Are you using a local index? Can you share the basics please (HBase and Phoenix versions). I'm not seeing if you've shared this previously on this or another thread. Sorry if you have. Short-answer, it's possible that something around secondary indexing in Phoenix causes this but not possibl

Re: Is there any way to using appropriate index automatically?

2019-08-19 Thread Josh Elser
http://phoenix.apache.org/faq.html#Why_isnt_my_secondary_index_being_used On 8/19/19 6:06 AM, you Zhuang wrote: Phoenix-version: 4.14.3-HBase-1.4-SNAPSHOT hbase-version: 1.4.6 Table: CREATE TABLE test_phoenix.app ( dt integer not null, a bigint not null , b bigint not null , c bigint not null ,
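One common fix from that FAQ entry is forcing the planner to use a particular index with a hint. A sketch against the table from the question (the index name is a placeholder):

```sql
-- The INDEX hint tells the optimizer to use idx_app_a for this query.
SELECT /*+ INDEX(app idx_app_a) */ dt, a, b
FROM test_phoenix.app AS app
WHERE a = 42;
```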

Re: Is there any way to using appropriate index automatically?

2019-08-22 Thread Josh Elser
not contained in the index. This is done by default for local indexes because we know that the table and index data co-reside on the same region server thus ensuring the lookup is local.” I ‘m totally confused. On Aug 20, 2019, at 12:32 AM, Josh Elser <mailto:els...@apache.org>>
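The "covered" behavior being discussed can be made explicit for a global index with INCLUDE, so a query touching only the indexed and included columns never goes back to the data table (index name is a placeholder):

```sql
-- b and c ride along in the index rows, so
-- SELECT b, c FROM test_phoenix.app WHERE a = ?
-- can be answered from the index alone.
CREATE INDEX idx_app_a ON test_phoenix.app (a) INCLUDE (b, c);
```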

Re: On duplicate key update

2019-08-26 Thread Josh Elser
Out of the box, Phoenix will provide the same semantics that HBase does for concurrent updates to a (data) table. https://hbase.apache.org/acid-semantics.html If you're also asking about how index tables remain in sync, the answer is a bit more complicated (and has changed in recent versions).
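For the single-table case, Phoenix's atomic upsert syntax looks like this (table and column names are placeholders):

```sql
-- Atomically increment a counter, initializing it to 1 on first write.
UPSERT INTO COUNTERS (ID, CNT) VALUES ('page-1', 1)
ON DUPLICATE KEY UPDATE CNT = CNT + 1;
```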

Re: Is there a way to specify split num or reducer num when creating phoenix table ?

2019-08-29 Thread Josh Elser
Configuring salt buckets is not the same thing as pre-splitting a table. You should not be setting a crazy large number of buckets like you are. If you want more parallelism in the MapReduce job, pre-split along date-boundaries, with the salt bucket taken into consideration (e.g. \x00_date, \x
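A small sketch of the "salt bucket taken into consideration" advice above: each pre-split key is a salt byte followed by a date boundary, so every bucket gets split at the same dates. This only illustrates how one might compute split keys to hand to HBase; it is not Phoenix code, and the bucket count and dates are made up.

```python
def salted_split_points(salt_buckets, date_boundaries):
    """Build pre-split keys: one (salt byte + date) key per bucket/date pair."""
    points = []
    for salt in range(salt_buckets):
        for boundary in date_boundaries:
            # Phoenix prepends a single salt byte (0 .. SALT_BUCKETS-1)
            # to every row key, so split keys must carry it too.
            points.append(bytes([salt]) + boundary)
    return points

splits = salted_split_points(4, [b"20190101", b"20190701"])
print(len(splits))  # 4 buckets x 2 boundaries = 8 split points
print(splits[0])
```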

Re: Any reason for so small phoenix.mutate.batchSize by default?

2019-09-03 Thread Josh Elser
Hey Alexander, Was just poking at the code for this: it looks like this is really just determining the number of mutations that get "processed together" (as opposed to a hard limit). Since you have done some work, I'm curious if you could generate some data to help back up your suggestion:
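For anyone wanting to experiment with this setting: it is a client-side property and can be set like any other Phoenix config (the value below is illustrative):

```xml
<!-- hbase-site.xml on the client, or passed as a connection property -->
<property>
  <name>phoenix.mutate.batchSize</name>
  <value>5000</value>
</property>
```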
