Re: Is first query to a table region way slower?

2018-01-29 Thread Mujtaba Chohan
Just to remove one variable, can you repeat the same test after truncating the Phoenix stats table? (either truncate SYSTEM.STATS from the HBase shell or use SQL: DELETE FROM SYSTEM.STATS) On Mon, Jan 29, 2018 at 4:36 PM, Pedro Boado wrote: > Yes there is a rs.next(). > > In fact

Re: Efficient way to get the row count of a table

2017-12-19 Thread Mujtaba Chohan
Another alternative outside Phoenix is to use the RowCounter M/R job: http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/RowCounter.html. On Tue, Dec 19, 2017 at 3:18 PM, James Taylor wrote: > If it needs to be 100% accurate, then count(*) is the only way. If your > data is

Re: Spark & UpgradeInProgressException: Cluster is being concurrently upgraded from 4.11.x to 4.12.x

2017-11-10 Thread Mujtaba Chohan
Probably being hit by https://issues.apache.org/jira/browse/PHOENIX-4335. Please upgrade to 4.13.0 which will be available by EOD today. On Fri, Nov 10, 2017 at 8:37 AM, Stepan Migunov < stepan.migu...@firstlinesoftware.com> wrote: > Hi, > > > > I have just upgraded my cluster to Phoenix 4.12

Re: Short Tables names and column names

2017-05-30 Thread Mujtaba Chohan
t; > please help. > > thanks, > -ash > > > On Tue, May 30, 2017 at 4:16 PM, Mujtaba Chohan <mujt...@apache.org> > wrote: > >> Holds true for Phoenix as well and it provides built-in support for >> column mapping so you can still use long colu

Re: Short Tables names and column names

2017-05-30 Thread Mujtaba Chohan
Holds true for Phoenix as well and it provides built-in support for column mapping so you can still use long column names, see http://phoenix.apache.org/columnencoding.html. Also see related performance optimization, SINGLE_CELL_ARRAY_WITH_OFFSETS encoding for immutable data. On Tue, May 30, 2017

Re: Export large query results to CSV

2017-05-15 Thread Mujtaba Chohan
You might be able to use sqlline to export: use the !outputformat csv and !record commands to export as CSV locally. On Sun, May 14, 2017 at 8:22 PM, Josh Elser wrote: > I am not aware of any mechanisms in Phoenix that will automatically write > formatted data, locally or
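For readers who want the same result programmatically rather than through sqlline, here is a minimal sketch of writing result rows out as CSV with Python's csv module. The result set is stubbed, since a real export would iterate a live JDBC/phoenixdb cursor instead:

```python
import csv
import io

def export_to_csv(cursor_rows, header, out):
    """Write query results to CSV, mirroring what sqlline's
    `!outputformat csv` + `!record` combination produces."""
    writer = csv.writer(out)
    writer.writerow(header)
    for row in cursor_rows:
        writer.writerow(row)

# Stubbed result set standing in for a live Phoenix cursor.
rows = [(1, "alice"), (2, "bob")]
buf = io.StringIO()
export_to_csv(rows, ["ID", "NAME"], buf)
print(buf.getvalue())
```

For large exports, streaming row-by-row like this (rather than materializing the full result set) keeps client memory flat.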

Re: Using Apache perf with Hbase 1.1

2016-10-18 Thread Mujtaba Chohan
> Cannot get all table regions
Check that there are no offline regions. See related thread here. On Tue, Oct 18, 2016 at 2:11 PM, Pradheep Shanmugam <

Re: Question regarding designing row keys

2016-10-04 Thread Mujtaba Chohan
If you lead with a timestamp key, you might want to consider experimenting with salting, as writes would hotspot on a single region if keys are monotonically increasing. On Tue, Oct 4, 2016 at 8:04 AM, Ciureanu Constantin < ciureanu.constan...@gmail.com>
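The salting the reply refers to prepends a hash-derived byte to each row key so that monotonically increasing keys fan out across regions instead of piling onto one. A sketch of the idea (the hash below is illustrative only, not Phoenix's actual salt hash):

```python
def salt_bucket(row_key: bytes, buckets: int) -> int:
    # Illustrative hash only -- Phoenix uses its own internal hash
    # over the row key bytes, but the principle is identical.
    h = 0
    for b in row_key:
        h = (h * 31 + b) & 0xFFFFFFFF
    return h % buckets

def salted_key(row_key: bytes, buckets: int) -> bytes:
    # A single salt byte is prepended, turning monotonically
    # increasing keys into writes spread across `buckets` regions.
    return bytes([salt_bucket(row_key, buckets)]) + row_key

# Monotonically increasing timestamp keys all land at the "end" of an
# unsalted table, but hit many different buckets once salted.
keys = [str(1600000000 + i).encode() for i in range(1000)]
hit = {salt_bucket(k, 8) for k in keys}
print(sorted(hit))  # multiple buckets receive writes
```

The trade-off, discussed elsewhere in this archive, is that salting turns a single range scan into one scan per bucket.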

Re: Phoenix has slow response times compared to HBase

2016-08-31 Thread Mujtaba Chohan
…your HBase equivalent code? * Any Phoenix tuning defaults that you changed? Thanks, Mujtaba (previous response wasn't complete before I hit send) On Wed, Aug 31, 2016 at 10:40 AM, Mujtaba Chohan <mujt...@apache.org> wrote: > Something seems inherently wrong in these test results.

Re: Phoenix has slow response times compared to HBase

2016-08-31 Thread Mujtaba Chohan
Something seems inherently wrong in these test results. * How are you running Phoenix queries? Were the concurrent Phoenix queries using the same JVM? Was the JVM restarted after changing the number of concurrent users? * Is the response time plotted when the query is executed for the first time or

Re:

2016-07-28 Thread Mujtaba Chohan
To use the pherf-cluster.py script, make sure the $HBASE_DIR/bin/hbase file is available, as it is used to construct the classpath. Also add the following line to the script before java_cmd is executed, to make sure the *hbasecp* variable contains the Phoenix jar: print "Classpath used to launch pherf: " + hbasecp Also try

Re: How to tell when an insertion has "finished"

2016-07-28 Thread Mujtaba Chohan
…you're right that > Phoenix would return while replication may still be occurring. > On Thu, Jul 28, 2016 at 12:06 PM, Mujtaba Chohan <mujt...@apache.org> wrote: >> Query running first time would be slower since data is not in HBase cache >> rather than things being not s

Re: How to tell when an insertion has "finished"

2016-07-28 Thread Mujtaba Chohan
A query running for the first time would be slower since the data is not yet in the HBase cache, rather than things being not settled. Replication shouldn't be putting load on the cluster, which you can check by turning replication off. On the HBase side, the way to force things to be optimal before running perf queries is to do a major

Re: Local Phoenix installation for testing

2016-07-21 Thread Mujtaba Chohan
It would be simpler and more reliable to use a mini-cluster with Phoenix for unit tests. In your project, which should already have phoenix-core as a dependency, just extend your test class from org.apache.phoenix.end2end.BaseHBaseManagedTimeIT, which will take care of mini-cluster setup with Phoenix.

Re: phoenix.query.maxServerCacheBytes not used

2016-07-19 Thread Mujtaba Chohan
phoenix.query.maxServerCacheBytes is a client side parameter. If you are using bin/sqlline.py then set this property in bin/hbase-site.xml and restart sqlline. - mujtaba On Tue, Jul 19, 2016 at 1:59 PM, Nathan Davis wrote: > Hi, > I am running a standalone HBase

Re: Index tables at scale

2016-07-11 Thread Mujtaba Chohan
…exactly what I meant. While not all >> our tables need these many regions and indexes, the number of regions per region >> server can grow quickly. >> -Simon >> On Jul 11, 2016, at 2:17 PM, Mujtaba Chohan <mujt...@apache.org> wrote: >> 12 index table

Re: Index tables at scale

2016-07-11 Thread Mujtaba Chohan
12 index tables * 256 regions per table = ~3K regions for index tables, assuming we are talking about a covered index, which implies 200+ regions per region server on a 15-node cluster. On Mon, Jul 11, 2016 at 1:58 PM, James Taylor wrote: > Hi Simon, > > I might be missing
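The arithmetic in the reply can be reproduced directly:

```python
# Back-of-envelope region math from the thread: many pre-split index
# tables multiply quickly into a large per-region-server load.
index_tables = 12
regions_per_table = 256
region_servers = 15

total_index_regions = index_tables * regions_per_table
per_rs = total_index_regions / region_servers
print(total_index_regions, round(per_rs))  # 3072 regions, ~205 per RS
```

Note this counts only index regions; data-table regions sit on top of that figure.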

Re: Phoenix performance at scale

2016-07-08 Thread Mujtaba Chohan
> How do response times vary as the number of rows in a table increases? > How do response times vary as the number of HBase nodes increases? It's linear, but many factors determine what that line/curve looks like, as it depends on the type of query you are executing and how data gets

Re: Number of Columns in Phoenix Table

2016-06-29 Thread Mujtaba Chohan
I haven't exhaustively perf tested, but I have a Phoenix table with 15K columns in a single column family, storing values in only 20 or so columns per row, and its performance seems on par with a table with few columns. On Wed, Jun 29, 2016 at 3:27 AM, Siddharth Ubale < siddharth.ub...@syncoms.com>

Re: Phoenix Performance issue

2016-05-11 Thread Mujtaba Chohan
…ta > On 11 May 2016 at 04:12, Mujtaba Chohan <mujt...@apache.org> wrote: >> Tried the following in Sqlline/Phoenix and HBase shell. Both take ~20ms for >> point lookups with local HBase. >> hbase(main):015:0> get 'MYTABLE','a' >> COLUMN

Re: Region Server Crash On Upsert Query Execution

2016-03-31 Thread Mujtaba Chohan
…server JVM crash. For one >> of such errors, the logs are attached. Let me know your inputs. >> Thanks, >> Amit. >> On Thu, Mar 31, 2016 at 6:15 PM, Mujtaba Chohan <mujt...@apache.org> wrote: >>> Can you attached la

Re: Tephra not starting correctly.

2016-03-31 Thread Mujtaba Chohan
…/tephra start worked correctly and it > was able to become the leader. > Do you think this might be a bug? > On 31/03/2016 11:53 AM, Mujtaba Chohan wrote: > I still see you have the following on the classpath: > opt/hbase/phoenix-assembly/target/* > On Wed, Mar 30, 20

Re: Tephra not starting correctly.

2016-03-30 Thread Mujtaba Chohan
.//asm-3.2.jar:/usr/hdp/2.4.0.0-169/hadoop-mapreduce/.//gson-2.2.4.jar:/usr/hdp/2.4.0.0-169/hadoop-mapreduce/.//hadoop-auth.jar:/usr/hdp/2.4.0.0-169/hadoop-mapreduce/.//hadoop-mapreduce-client-jobclient-2.7.1.2.4.0.0-169-tests.jar:/usr/hdp/2.4.0.0-169/hadoop-mapreduce/.//commons-lang3-3.3

Re: Tephra not starting correctly.

2016-03-30 Thread Mujtaba Chohan
…ketNIO.java:361) > at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) > Any ideas how this can be fixed? > On 31/03/2016 4:47 AM, Mujtaba Chohan wrote: > Few pointers: > - phoenix-core-*.jar is a subset of phoenix-*-server.j

Re: Tephra not starting correctly.

2016-03-30 Thread Mujtaba Chohan
Few pointers: - phoenix-core-*.jar is a subset of phoenix-*-server.jar so just phoenix-*-server.jar in hbase/lib is enough for region servers and master. - phoenix-server-*-runnable.jar and phoenix-*-server.jar should be enough for query server. Client jar would only duplicate HBase classes in

Re: Speeding Up Group By Queries

2016-03-29 Thread Mujtaba Chohan
…again with 10 mil records. They do not seem to help out much in > terms of the timings. Kindly find the phoenix log file attached. Let me > know if I am missing anything. > Thanks, > Amit. > On Mon, Mar 28, 2016 at 11:44 PM, Mujtaba Chohan <mujt...@apache.org> wrote:

Re: Speeding Up Group By Queries

2016-03-28 Thread Mujtaba Chohan
>> CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TRANSACTIONS_COUNTRY_INDEX >> SERVER AGGREGATE INTO ORDERED DISTINCT ROWS BY ["T_COUNTRY"] >> CLIENT MERGE SORT

Re: Speeding Up Group By Queries

2016-03-25 Thread Mujtaba Chohan
That seems excessively slow for 10M rows, which should take on the order of a few seconds at most without an index. 1. How wide is your table? 2. How many region servers is your data distributed on, and what's the heap size? 3. Do you see lots of disk I/O on region servers during aggregation? 4. Can you try

Re: looking for help with Pherf setup

2016-02-29 Thread Mujtaba Chohan
This is a ClassNotFoundException. Can you make sure the Phoenix jar is available on the classpath for Pherf? If Phoenix is available in the HBase/lib directory and the HBASE_DIR environment variable is set, then that should fix it. Also, to test out first, you can run pherf_standalone.py with local HBase to see if

Re: YCSB with Phoenix?

2016-02-19 Thread Mujtaba Chohan
You can apply this patch on YCSB to test out Phoenix with a variable number of VARCHAR fields, as well as to test out combinations of single/multiple CFs, compression, and salt buckets. See usage details here

Re: Select by first part of composite primary key, is it effective?

2016-02-02 Thread Mujtaba Chohan
…your case. I >>> didn't understand you correctly. What is the difference between salted/not >>> salted tables in case of "primary key leading-part select"? >>> 2016-02-02 1:18 GMT+01:00 Mujtaba Chohan <mujt...@apache.org>:

Re: Select by first part of composite primary key, is it effective?

2016-02-02 Thread Mujtaba Chohan
rt select"? > > 2016-02-02 1:18 GMT+01:00 Mujtaba Chohan <mujt...@apache.org > <javascript:_e(%7B%7D,'cvml','mujt...@apache.org');>>: > >> If you are filtering on leading part of row key which is highly selective >> then you would be better off not usin

Re: Select by first part of composite primary key, is it effective?

2016-02-01 Thread Mujtaba Chohan
If you are filtering on the leading part of the row key and it is highly selective, then you would be better off not using salt buckets altogether, rather than having 100 parallel scans and block reads in your case. In our test with a billion+ row table, the non-salted table offered much better performance since it
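The "100 parallel scans" cost mentioned here follows from the salt byte being the first byte of the key: a scan over a row-key prefix can no longer be one contiguous range. A sketch under that assumption (key layout simplified, with `\xff` as a naive range upper bound):

```python
def scan_ranges_for_prefix(prefix: bytes, buckets: int):
    """On a salted table the salt byte comes first, so a highly
    selective leading-key scan must issue one range per salt bucket
    (100 scans for 100 buckets, as described in the thread)."""
    if buckets == 0:
        # Unsalted table: a single contiguous range covers the prefix.
        return [(prefix, prefix + b"\xff")]
    return [(bytes([b]) + prefix, bytes([b]) + prefix + b"\xff")
            for b in range(buckets)]

print(len(scan_ranges_for_prefix(b"user123", 0)))    # 1 range
print(len(scan_ranges_for_prefix(b"user123", 100)))  # 100 ranges
```

This is why salting pays off for write-hotspot avoidance and full scans, but hurts selective point/prefix lookups.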

Re: phoenix-4.4.0-HBase-1.1-client.jar in maven?

2015-11-25 Thread Mujtaba Chohan
Kristoffer - If you use the *phoenix-core* dependency in your pom.xml as described here, then it's equivalent to having a project dependency on the phoenix-client jar, as Maven would resolve all dependencies needed by phoenix-core. Note that the phoenix-client jar is just

Re: Apache Phoenix Tracing

2015-11-03 Thread Mujtaba Chohan
traceserver.py is in Phoenix 4.6.0. On Tue, Nov 3, 2015 at 12:42 AM, Nanda wrote: > > Hi All, > > I am trying to

Re: Number of regions in SYSTEM.SEQUENCE

2015-09-22 Thread Mujtaba Chohan
Since Phoenix 4.5.x, the default for phoenix.sequence.saltBuckets has been changed so that the sequence table is not split. See this

Re: [ANNOUNCE] Welcome our newest Committer Dumindu Buddhika

2015-09-18 Thread Mujtaba Chohan
Welcome onboard Dumindu!! On Friday, September 18, 2015, Nick Dimiduk wrote: > Nice work Dumindu! > > On Thu, Sep 17, 2015 at 9:18 PM, Vasudevan, Ramkrishna S < > ramkrishna.s.vasude...@intel.com > wrote: > > > Hi All > > > > Please welcome our newest committer

Re: NoClassDefFoundError: Could not initialize class org.apache.hadoop.hbase.protobuf.ProtobufUtil while trying to connect to HBase with Phoenix

2015-09-08 Thread Mujtaba Chohan
Can you try with phoenix-client jar instead of phoenix-client-*minimal* jar? On Tue, Sep 8, 2015 at 10:42 AM, Dmitry Goldenberg wrote: > I'm getting this error while trying to connect to HBase in a clustered > environment. The code seems to work fine in a single node

Re: missing rows after using performance.py

2015-09-08 Thread Mujtaba Chohan
Thanks James. Filed https://issues.apache.org/jira/browse/PHOENIX-2240. On Tue, Sep 8, 2015 at 12:38 PM, James Heather wrote: > Thanks. > > I've discovered that the cause is even simpler. With 100M rows, you get > collisions in the primary key in the CSV file. An
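Why 100M generated rows collide: with random keys drawn from a bounded key space, the birthday bound makes duplicate primary keys near-certain. The key space below is hypothetical (the actual generator in performance.py may differ); the point is the order of magnitude:

```python
def expected_collisions(n_rows: int, key_space: int) -> float:
    # Expected number of colliding pairs when n_rows keys are drawn
    # uniformly at random from key_space possibilities (birthday bound:
    # C(n, 2) pairs, each colliding with probability 1/key_space).
    return n_rows * (n_rows - 1) / (2 * key_space)

# Hypothetical key space of 10^9 distinct random keys.
print(expected_collisions(100_000_000, 10**9))  # ~5 million colliding pairs
```

With collisions that common, later UPSERTs silently overwrite earlier rows, which is exactly the "missing rows" symptom in the thread.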

Re: Maven issue with version 4.5.0

2015-08-13 Thread Mujtaba Chohan
I'll take a look and will update. Thanks, Mujtaba On Thu, Aug 13, 2015 at 8:33 AM, Yiannis Gkoufas johngou...@gmail.com wrote: Hi there, When I try to include the following in my pom.xml: dependency groupIdorg.apache.phoenix/groupId

Re: Maven issue with version 4.5.0

2015-08-13 Thread Mujtaba Chohan
Hi Yiannis. Please retry now. Thanks, Mujtaba On Thu, Aug 13, 2015 at 10:44 AM, Mujtaba Chohan mujt...@apache.org wrote: I'll take a look and will update. Thanks, Mujtaba On Thu, Aug 13, 2015 at 8:33 AM, Yiannis Gkoufas johngou...@gmail.com wrote: Hi there, When I try to include

Re: Phoenix table scan performance

2015-03-09 Thread Mujtaba Chohan
During your scan with data on a single region server (RS), do you see the RS blocked on disk I/O due to heavy reads, or 100% CPU utilization? If that is the case, then having data distributed on 2 RS would effectively cut the time in half. On Mon, Mar 9, 2015 at 10:01 AM, Yohan Bismuth yohan.bismu...@gmail.com

Re: Incompatible jars detected between client and server with CsvBulkloadTool

2015-02-26 Thread Mujtaba Chohan
Just tried connecting Sqlline using Phoenix 4.3 with a clean HBase 0.98.4-hadoop2 and it worked fine. Any chance you are using hadoop1? On Thu, Feb 26, 2015 at 10:38 AM, Naga Vijayapuram naga_vijayapu...@gap.com wrote: Hi Sun, See my comment in

Re: Update statistics made query 2-3x slower

2015-02-13 Thread Mujtaba Chohan
used a hint in the SELECT /*+ SKIP_SCAN */ which shouldn’t be mandatory in my opinion). Regards, Constantin *From:* Mujtaba Chohan [mailto:mujt...@apache.org] *Sent:* Thursday, February 12, 2015 9:20 PM *To:* user@phoenix.apache.org *Subject:* Re: Update statistics made query 2-3x

Re: Update statistics made query 2-3x slower

2015-02-11 Thread Mujtaba Chohan
To compare performance without stats, try deleting related rows from SYSTEM.STATS, or an easier way: just truncate the SYSTEM.STATS table from the HBase shell and restart your region servers. //mujtaba On Wed, Feb 11, 2015 at 10:29 AM, Vasudevan, Ramkrishna S ramkrishna.s.vasude...@intel.com wrote:

Re: Performance options for doing Phoenix full table scans to complete some data statistics and summary collection work

2015-01-08 Thread Mujtaba Chohan
With 100+ columns, using multiple column families will help a lot if your full scan uses only a few columns. Also, if columns are wide, then turning on compression would help if you are seeing disk I/O contention on region servers. On Wednesday, January 7, 2015, James Taylor jamestay...@apache.org
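A toy model of why separating column families helps full scans that touch only a few columns (column names and sizes below are made up for illustration):

```python
def bytes_scanned(cols_read, col_sizes, family_of):
    """Model of a full scan: store files are read for every column
    family containing at least one requested column, so putting
    rarely-read wide columns in their own family keeps their bytes
    out of scans that don't need them."""
    needed = {family_of[c] for c in cols_read}
    return sum(size for col, size in col_sizes.items()
               if family_of[col] in needed)

# Hypothetical 4-column table: two small "hot" columns, two wide
# "cold" blob columns.
sizes = {"id": 8, "status": 8, "blob1": 10_000, "blob2": 10_000}
one_cf = {c: "A" for c in sizes}                                # all in A
two_cf = {"id": "A", "status": "A", "blob1": "B", "blob2": "B"}  # split

print(bytes_scanned(["status"], sizes, one_cf))  # 20016 -- reads everything
print(bytes_scanned(["status"], sizes, two_cf))  # 16 -- skips family B
```

Compression, as the reply notes, attacks the same I/O bottleneck from the other direction by shrinking the bytes each scan must read.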

Re: problem about using tracing

2014-09-03 Thread Mujtaba Chohan
Phoenix connection URL should be of this form jdbc:phoenix:zookeeper2,zookeeper1,zookeeper3:2181 On Wed, Sep 3, 2014 at 12:11 PM, Jesse Yates jesse.k.ya...@gmail.com wrote: It looks like the connection string that the tracing module is using isn't configured correctly. Is 2181 the client
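A small helper sketch for assembling that URL shape (the function name is ours, not a Phoenix API):

```python
def phoenix_jdbc_url(quorum, port=2181):
    """Build a Phoenix JDBC URL of the form shown in the thread:
    jdbc:phoenix:<zk1,zk2,...>:<zookeeper client port>."""
    return "jdbc:phoenix:" + ",".join(quorum) + ":" + str(port)

url = phoenix_jdbc_url(["zookeeper2", "zookeeper1", "zookeeper3"])
print(url)  # jdbc:phoenix:zookeeper2,zookeeper1,zookeeper3:2181
```

The hosts are the ZooKeeper quorum members, and the trailing port is the ZooKeeper client port, not an HBase port.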