Re: Connecting via SquirrelSQL

2014-06-10 Thread Mujtaba Chohan
Jars from the last successful 3.1-SNAPSHOT build are at https://builds.apache.org/job/Phoenix-3.0-hadoop1/lastSuccessfulBuild/artifact/ On Tue, Jun 10, 2014 at 11:28 AM, anil gupta wrote: > Hi Justin, > > This ticket does not use the ticket cache. You will need to have a keytab file > and principal to log in. >
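
For reference, a minimal sketch of a keytab-based Phoenix JDBC connection; the quorum host, principal, and keytab path below are placeholders, and this assumes the documented jdbc:phoenix:<zk quorum>:<port>:<znode>:<principal>:<keytab> URL layout:

    import java.sql.Connection;
    import java.sql.DriverManager;

    public class SecurePhoenixConnect {
        public static void main(String[] args) throws Exception {
            // All values below are placeholders for illustration.
            String url = "jdbc:phoenix:zk1.example.com:2181:/hbase-secure"
                    + ":myuser@EXAMPLE.COM:/etc/security/keytabs/myuser.keytab";
            try (Connection conn = DriverManager.getConnection(url)) {
                System.out.println("Connected: " + !conn.isClosed());
            }
        }
    }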

Re: OOM error

2014-06-13 Thread Mujtaba Chohan
A fix for the 3.0/4.0 branch is already in progress. Jira: PHOENIX-990. On Fri, Jun 13, 2014 at 9:17 AM, Skanda Prasad wrote: > Hi, > > When I run a group by query on a data size of approx 50 million records, > I'm getting the below OOM error with

Re: Error connecting through SQuirrel

2014-08-27 Thread Mujtaba Chohan
Just tested the latest Phoenix 4.1 RC https://dist.apache.org/repos/dist/dev/phoenix/phoenix-4.1.0-rc1/bin/ with CDH 5.1 and it works fine. Copy hadoop2/phoenix-4.1.0-server-hadoop2.jar to all region servers and restart, then use hadoop2/phoenix-4.1.0-client-hadoop2.jar with your client or use sqlline: hadoo

Re: problem about using tracing

2014-09-03 Thread Mujtaba Chohan
The Phoenix connection URL should be of this form: jdbc:phoenix:zookeeper2,zookeeper1,zookeeper3:2181 On Wed, Sep 3, 2014 at 12:11 PM, Jesse Yates wrote: > It looks like the connection string that the tracing module is using isn't > configured correctly. Is 2181 the client port on which you are run
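
A minimal JDBC sketch using a URL of that form; the quorum hosts are placeholders, and SYSTEM.CATALOG is queried only as a table that always exists:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;

    public class PhoenixConnect {
        public static void main(String[] args) throws Exception {
            String url = "jdbc:phoenix:zookeeper1,zookeeper2,zookeeper3:2181";
            try (Connection conn = DriverManager.getConnection(url);
                 ResultSet rs = conn.createStatement().executeQuery(
                         "SELECT TABLE_NAME FROM SYSTEM.CATALOG LIMIT 1")) {
                if (rs.next()) {
                    System.out.println("Connected; first table: " + rs.getString(1));
                }
            }
        }
    }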

Re: Phoenix client maven dependencies

2014-09-18 Thread Mujtaba Chohan
Flavio - The client jar is composed of multiple dependency jars extracted into one and is available in the binary download only, for convenience. Bundled jars of this type are not supposed to go in the Maven repo, as Maven automatically resolves the required dependencies. To

Re: SALT_BUCKETS

2014-10-09 Thread Mujtaba Chohan
In addition to what Gabriel said, it also depends on the kind of queries you would most frequently execute. If your queries are highly selective, say the filter just aggregates 1% of the rows of the entire table based on row key, then keeping the salt bucket count low, or even not salting at all, might give better perf

Re: Performance options for doing Phoenix full table scans to complete some data statistics and summary collection work

2015-01-08 Thread Mujtaba Chohan
With 100+ columns, using multiple column families will help a lot if your full scan uses only a few columns. Also, if columns are wide, turning on compression would help if you are seeing disk I/O contention on region servers. On Wednesday, January 7, 2015, James Taylor wrote: > Hi Sun, > Can

Re: Re: Performance options for doing Phoenix full table scans to complete some data statistics and summary collection work

2015-01-13 Thread Mujtaba Chohan
+ > rows. How many column family > names would you recommend us to apply then? Maybe only two to three column > families are enough? > We had one cluster with 5 nodes. > > Thanks, > Sun. > > -- > ---------- > > Cer

Re: Index Creation

2015-01-22 Thread Mujtaba Chohan
Hi Siddharth, Both scenarios (creating the index before or after the data load) would work. If the index is created before data is inserted, overall insert time increases; if the index is created after the data load, there is a one-time cost of building it. In case your data is immutable, then using
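
A sketch of both orderings; the table, column, and index names are hypothetical:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class IndexTiming {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
                 Statement stmt = conn.createStatement()) {
                stmt.execute("CREATE TABLE IF NOT EXISTS ORDERS ("
                        + "ID BIGINT PRIMARY KEY, CUSTOMER VARCHAR, AMOUNT DECIMAL)");
                // Option A: index first; every subsequent upsert also maintains
                // the index, so overall insert time goes up.
                stmt.execute("CREATE INDEX IF NOT EXISTS ORDERS_CUST_IDX ON ORDERS (CUSTOMER)");
                // Option B: load the data first, then run the same CREATE INDEX
                // statement afterwards and pay the one-time cost of building it.
            }
        }
    }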

Re: Update statistics made query 2-3x slower

2015-02-11 Thread Mujtaba Chohan
To compare performance without stats, try deleting the related rows from SYSTEM.STATS or, as an easier way, just truncate the SYSTEM.STATS table from the HBase shell and restart your region servers. //mujtaba On Wed, Feb 11, 2015 at 10:29 AM, Vasudevan, Ramkrishna S < ramkrishna.s.vasude...@intel.com> wrote: >
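
A minimal SQL-level sketch of the same cleanup (the SQL delete route is also mentioned later in this archive); the assumption that PHYSICAL_NAME is the leading key column of SYSTEM.STATS should be verified against your version:

    import java.sql.Connection;
    import java.sql.DriverManager;

    public class ClearStats {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost")) {
                // Delete stats for one table only (PHYSICAL_NAME assumed), or omit
                // the WHERE clause to clear everything, then restart region servers
                // as advised above.
                conn.createStatement().executeUpdate(
                        "DELETE FROM SYSTEM.STATS WHERE PHYSICAL_NAME = 'MY_TABLE'");
                conn.commit();
            }
        }
    }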

Re: Update statistics made query 2-3x slower

2015-02-12 Thread Mujtaba Chohan
ba for the info, > > Thank you Vasudevan for the explanations, I already used HBase and I agree > it’s hard to have a counter for the table rows (especially if the > tombstones for deleted rows are still there, i.e., not compacted yet). > > > > Constantin > > > >

Re: Update statistics made query 2-3x slower

2015-02-13 Thread Mujtaba Chohan
] or [first part of PK between C and D] or > [….] was understood as a full table scan = painfully slow -> but this > worked today after I used a hint in the SELECT /*+ SKIP_SCAN */ which > shouldn’t be mandatory in my opinion). > > > > Regards, > > Constantin >

Re: Incompatible jars detected between client and server with CsvBulkloadTool

2015-02-26 Thread Mujtaba Chohan
Just tried connecting Sqlline using Phoenix 4.3 with clean HBase 0.98.4-hadoop2 and it worked fine. Any change you are using hadoop1? On Thu, Feb 26, 2015 at 10:38 AM, Naga Vijayapuram wrote: > Hi Sun, > > See my comment in https://issues.apache.org/jira/browse/PHOENIX-1248 … > > || >

Re: Phoenix table scan performance

2015-03-09 Thread Mujtaba Chohan
During your scan with data on a single region server (RS), do you see the RS blocked on disk I/O due to heavy reads, or 100% CPU utilization? If that is the case, then having data distributed over 2 RS would effectively cut the time in half. On Mon, Mar 9, 2015 at 10:01 AM, Yohan Bismuth wrote: > Hello, > we're

Re: export phoenix table to csv

2015-03-17 Thread Mujtaba Chohan
You can set *outputformat=csv* in sqlline.py as an argument for SqlLine and then redirect your query output to a file. //mujtaba On Tue, Mar 17, 2015 at 10:12 AM, Yosi Botzer wrote: > Hi, > > Is there a way to export a phoenix table to csv (local file or hdfs)? > > Thanks > >

Re: Maven issue with version 4.5.0

2015-08-13 Thread Mujtaba Chohan
I'll take a look and will update. Thanks, Mujtaba On Thu, Aug 13, 2015 at 8:33 AM, Yiannis Gkoufas wrote: > Hi there, > > When I try to include the following in my pom.xml: > > <dependency> > <groupId>org.apache.phoenix</groupId> > <artifactId>phoenix-core</artifactId> > <version>4.5.0-HBase-0.98</version> > <scope>provided</scope> > </dependency> >

Re: Maven issue with version 4.5.0

2015-08-13 Thread Mujtaba Chohan
Hi Yiannis. Please retry now. Thanks, Mujtaba On Thu, Aug 13, 2015 at 10:44 AM, Mujtaba Chohan wrote: > I'll take a look and will update. > > Thanks, > Mujtaba > > On Thu, Aug 13, 2015 at 8:33 AM, Yiannis Gkoufas > wrote: > >> Hi there, >> >> Wh

Re: NoClassDefFoundError: Could not initialize class org.apache.hadoop.hbase.protobuf.ProtobufUtil while trying to connect to HBase with Phoenix

2015-09-08 Thread Mujtaba Chohan
Can you try with phoenix-client jar instead of phoenix-client-*minimal* jar? On Tue, Sep 8, 2015 at 10:42 AM, Dmitry Goldenberg wrote: > I'm getting this error while trying to connect to HBase in a clustered > environment. The code seems to work fine in a single node environment. > > The full se

Re: missing rows after using performance.py

2015-09-08 Thread Mujtaba Chohan
Thanks James. Filed https://issues.apache.org/jira/browse/PHOENIX-2240. On Tue, Sep 8, 2015 at 12:38 PM, James Heather wrote: > Thanks. > > I've discovered that the cause is even simpler. With 100M rows, you get > collisions in the primary key in the CSV file. An experiment (capturing the > CSV

Re: [ANNOUNCE] Welcome our newest Committer Dumindu Buddhika

2015-09-18 Thread Mujtaba Chohan
Welcome onboard Dumindu!! On Friday, September 18, 2015, Nick Dimiduk wrote: > Nice work Dumindu! > > On Thu, Sep 17, 2015 at 9:18 PM, Vasudevan, Ramkrishna S < > ramkrishna.s.vasude...@intel.com > wrote: > > > Hi All > > > > Please welcome our newest committer Dumindu Buddhika to the Apache > P

Re: Number of regions in SYSTEM.SEQUENCE

2015-09-22 Thread Mujtaba Chohan
Since Phoenix 4.5.x, the default for phoenix.sequence.saltBuckets has been changed to not split the sequence table. See this

Re: Apache Phoenix Tracing

2015-11-03 Thread Mujtaba Chohan
traceserver.py is in Phoenix 4.6.0. On Tue, Nov 3, 2015 at 12:42 AM, Nanda wrote: > > Hi All, > > I am trying to enable the tracing app as m

Re: phoenix-4.4.0-HBase-1.1-client.jar in maven?

2015-11-25 Thread Mujtaba Chohan
Kristoffer - If you use the *phoenix-core* dependency in your pom.xml as described here, then it's equivalent to having a project dependency on the phoenix-client jar, as Maven will resolve all dependencies needed by phoenix-core. Note that the phoenix-client jar is just a

Re: Select by first part of composite primary key, is it effective?

2016-02-01 Thread Mujtaba Chohan
If you are filtering on the leading part of the row key and it is highly selective, then you would be better off not using salt buckets altogether, rather than having 100 parallel scans and block reads in your case. In our test with a billion+ row table, the non-salted table offered much better performance since it

Re: Select by first part of composite primary key, is it effective?

2016-02-02 Thread Mujtaba Chohan
en you would be better off not using salt buckets all together rather > than having 100 parallel scan and block reads in your case. > I didn't understand you correctly. What is the difference between salted/not > salted table in case of "primary key leading-part select"? >

Re: Select by first part of composite primary key, is it effective?

2016-02-02 Thread Mujtaba Chohan
If you know your key space then you can use *SPLIT ON* in your table create DDL. See http://phoenix.apache.org/language On Tue, Feb 2, 2016 at 11:54 AM, Serega Sheypak wrote: > Hm... and what is the right way to presplit the table then? > > 2016-02-02 18:30 GMT+01:00 Mujtaba Chohan :
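
A sketch of a pre-split DDL; the table layout and split points are made up and should match your actual key distribution:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class PreSplit {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
                 Statement stmt = conn.createStatement()) {
                // Three split points yield four regions covering the key space.
                stmt.execute("CREATE TABLE METRICS ("
                        + " HOST VARCHAR NOT NULL,"
                        + " TS DATE NOT NULL,"
                        + " VAL DOUBLE,"
                        + " CONSTRAINT PK PRIMARY KEY (HOST, TS))"
                        + " SPLIT ON ('host-d', 'host-h', 'host-p')");
            }
        }
    }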

Re: YCSB with Phoenix?

2016-02-19 Thread Mujtaba Chohan
You can apply this patch on YCSB to test out Phoenix with a variable number of VARCHAR fields, as well as test combinations of single/multiple CFs, compression, and salt buckets. See usage details here

Re: looking for help with Pherf setup

2016-02-29 Thread Mujtaba Chohan
This is a ClassNotFoundException. Can you make sure the Phoenix jar is available on the classpath for Pherf? If Phoenix is available in the HBase/lib directory and the HBASE_DIR environment variable is set, then that should fix it. Also, to test things out first, you can run pherf_standalone.py with a local HBase to see if eve

Re: Speeding Up Group By Queries

2016-03-25 Thread Mujtaba Chohan
That seems excessively slow for 10M rows, which should take on the order of a few seconds at most without an index.
1. How wide is your table?
2. How many region servers is your data distributed on, and what's the heap size?
3. Do you see lots of disk I/O on region servers during aggregation?
4. Can you try your

Re: Speeding Up Group By Queries

2016-03-28 Thread Mujtaba Chohan
>> | CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TRANSACTIONS_COUNTRY_INDEX |
>> | SERVER AGGREGATE INTO ORDERED DISTINCT ROWS BY ["T_COUNTRY"] |
>> | CLIENT MERGE SORT |

Re: Speeding Up Group By Queries

2016-03-29 Thread Mujtaba Chohan
do not seem to help out much in > terms of the timings. Kindly find the phoenix log file attached. Let me > know if I am missing anything. > > Thanks, > Amit. > > On Mon, Mar 28, 2016 at 11:44 PM, Mujtaba Chohan > wrote: > >> Here's the chart for time it takes

Re: Tephra not starting correctly.

2016-03-30 Thread Mujtaba Chohan
Few pointers:
- phoenix-core-*.jar is a subset of phoenix-*-server.jar, so just phoenix-*-server.jar in hbase/lib is enough for region servers and master.
- phoenix-server-*-runnable.jar and phoenix-*-server.jar should be enough for the query server. The client jar would only duplicate HBase classes in hba

Re: Tephra not starting correctly.

2016-03-30 Thread Mujtaba Chohan
at > org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) > > Any ideas how this can be fixed? > > > > On 31/03/2016 4:47 AM, Mujtaba Chohan wrote: > > Few pointers: > > - phoenix-core-*.jar is a subset of phoenix-*-server.jar so just > phoeni

Re: Tephra not starting correctly.

2016-03-30 Thread Mujtaba Chohan
[quoted: a long hadoop-mapreduce classpath listing from the user's HDP 2.4.0.0-169 environment]

Re: Tephra not starting correctly.

2016-03-31 Thread Mujtaba Chohan
y and it > was able to become the leader. > > Do you think this might be a bug? > > On 31/03/2016 11:53 AM, Mujtaba Chohan wrote: > > I still see you have the following on classpath: > opt/hbase/phoenix-assembly/target/* > > On Wed, Mar 30, 2016 at 5:42 PM, F21 wrote:

Re: Region Server Crash On Upsert Query Execution

2016-03-31 Thread Mujtaba Chohan
Can you attach the last couple hundred lines from the RS log before it crashed? Also, what's the RS heap size? On Thu, Mar 31, 2016 at 1:48 AM, Amit Shah wrote: > Hi, > > We have been experimenting with hbase (version 1.0) and phoenix (version 4.6) > for our OLAP workload. In order to precalculate aggre

Re: Region Server Crash On Upsert Query Execution

2016-03-31 Thread Mujtaba Chohan
logs are attached. Let me know your inputs. >> >> Thanks, >> Amit. >> >> >> On Thu, Mar 31, 2016 at 6:15 PM, Mujtaba Chohan >> wrote: >> >>> Can you attached last couple of hundred lines from RS log before it >>> crashed? Also

Re: Phoenix Performance issue

2016-05-10 Thread Mujtaba Chohan
Tried the following in Sqlline/Phoenix and the HBase shell. Both take ~20ms for point lookups with local HBase.

hbase(main):015:0> get 'MYTABLE','a'
COLUMN                CELL
 0:MYCOL              timestamp=1462515518048, value=b
 0:_0                 timestamp=1462515518048, v
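
The Phoenix side of the same point lookup, as a sketch; the PK column name ID is an assumption, so adjust it to the real schema:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class PointLookup {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
                 PreparedStatement ps = conn.prepareStatement(
                         "SELECT MYCOL FROM MYTABLE WHERE ID = ?")) {
                ps.setString(1, "a"); // same row key as the shell get above
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getString(1));
                    }
                }
            }
        }
    }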

Re: Phoenix Performance issue

2016-05-11 Thread Mujtaba Chohan
This is with 4.5.2-HBase-0.98 and 4.x-HBase-0.98 head, got almost the same numbers with both. On Wed, May 11, 2016 at 12:19 AM, Naveen Nahata wrote: > Thanks Mujtaba. > > Could you tell me which version of phoenix are you using ? > > -Naveen Nahata > > On 11 May 2016 at

Re: Number of Columns in Phoenix Table

2016-06-29 Thread Mujtaba Chohan
I haven't exhaustively perf-tested this, but I have a Phoenix table with 15K columns in a single column family, storing values in only 20 or so columns per row, and its performance seems on par with a table with few columns. On Wed, Jun 29, 2016 at 3:27 AM, Siddharth Ubale < siddharth.ub...@syncoms.com> w

Re: Phoenix performance at scale

2016-07-08 Thread Mujtaba Chohan
> > How do response times vary as the number of rows in a table increases? > How do response times vary as the number of HBase nodes increases? > It's linear, but many factors affect what that linear line/curve looks like, as it depends on the type of query you are executing and how data gets spre

Re: Index tables at scale

2016-07-11 Thread Mujtaba Chohan
12 index tables * 256 regions per table = ~3K regions for index tables, assuming we are talking about covered indexes, which implies 200+ regions/region server on a 15-node cluster. On Mon, Jul 11, 2016 at 1:58 PM, James Taylor wrote: > Hi Simon, > > I might be missing something, but with 12 separate in

Re: Index tables at scale

2016-07-11 Thread Mujtaba Chohan
ions and indexes, the num of regions/region >> server can grow quickly. >> >> -Simon >> >> On Jul 11, 2016, at 2:17 PM, Mujtaba Chohan wrote: >> >> 12 index tables * 256 region per table = ~3K regions for index tables >> assuming we are talking of covere

Re: phoenix.query.maxServerCacheBytes not used

2016-07-19 Thread Mujtaba Chohan
phoenix.query.maxServerCacheBytes is a client-side parameter. If you are using bin/sqlline.py then set this property in bin/hbase-site.xml and restart sqlline. - mujtaba On Tue, Jul 19, 2016 at 1:59 PM, Nathan Davis wrote: > Hi, > I am running a standalone HBase locally with Phoenix installed b
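
Besides bin/hbase-site.xml, client-side properties can, to my understanding, also be passed programmatically on the connection; a sketch, with an arbitrary example value:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.util.Properties;

    public class ClientProps {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            // ~200 MB server cache; example value only.
            props.setProperty("phoenix.query.maxServerCacheBytes", "209715200");
            try (Connection conn =
                         DriverManager.getConnection("jdbc:phoenix:localhost", props)) {
                // ... run the statement that previously exceeded the cache limit ...
            }
        }
    }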

Re: Local Phoenix installation for testing

2016-07-21 Thread Mujtaba Chohan
It would be simpler and more reliable to use the mini-cluster with Phoenix for unit tests. In your project, which should already have phoenix-core as a dependency, just extend your test class from org.apache.phoenix.end2end.BaseHBaseManagedTimeIT, which will take care of mini-cluster setup with Phoenix. Exam
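
A minimal sketch of such a test; it assumes phoenix-core and its test artifact are on the test classpath and that the base class exposes getUrl() for the mini cluster:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;

    import org.apache.phoenix.end2end.BaseHBaseManagedTimeIT;
    import org.junit.Assert;
    import org.junit.Test;

    public class MyPhoenixIT extends BaseHBaseManagedTimeIT {
        @Test
        public void upsertAndRead() throws Exception {
            // getUrl() points at the mini cluster started by the base class.
            try (Connection conn = DriverManager.getConnection(getUrl())) {
                conn.createStatement().execute(
                        "CREATE TABLE T (ID BIGINT PRIMARY KEY, V VARCHAR)");
                conn.createStatement().executeUpdate("UPSERT INTO T VALUES (1, 'x')");
                conn.commit();
                ResultSet rs = conn.createStatement()
                        .executeQuery("SELECT V FROM T WHERE ID = 1");
                Assert.assertTrue(rs.next());
                Assert.assertEquals("x", rs.getString(1));
            }
        }
    }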

Re: How to tell when an insertion has "finished"

2016-07-28 Thread Mujtaba Chohan
A query running for the first time is slower because data is not yet in the HBase cache, rather than because things have not settled. Replication shouldn't be putting load on the cluster, which you can check by turning replication off. On the HBase side, the way to force things to be optimal before running perf queries is to do a major c
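
For reference, one way to trigger a major compaction programmatically with the HBase 1.x client API, as an alternative to the shell's major_compact command; the table name is a placeholder:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;

    public class Compact {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Admin admin = conn.getAdmin()) {
                // Asynchronous request; monitor compaction state separately if needed.
                admin.majorCompact(TableName.valueOf("MY_TABLE"));
            }
        }
    }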

Re: How to tell when an insertion has "finished"

2016-07-28 Thread Mujtaba Chohan
would return while replication may still be occurring. > > On Thu, Jul 28, 2016 at 12:06 PM, Mujtaba Chohan > wrote: > >> Query running first time would be slower since data is not in HBase cache >> rather than things being not settled. Replication shouldn't be p

Re:

2016-07-28 Thread Mujtaba Chohan
To use the pherf-cluster.py script, make sure the $HBASE_DIR/bin/hbase file is available, as it is used to construct the classpath. Also add the following line to the script before java_cmd is executed, to make sure the *hbasecp* variable contains the phoenix jar: print "Classpath used to launch pherf: " + hbasecp Also try

Re: Issues while Running Apache Phoenix against TPC-H data

2016-08-12 Thread Mujtaba Chohan
Hi Amit,
* What's the heap size of each of your region servers?
* Do you see a huge amount of disk reads when you do a select count(*) from tpch.lineitem? If yes, then try setting snappy compression on your table followed by a major compaction.
* Were there any deleted rows in this table? What's the row

Re: Phoenix has slow response times compared to HBase

2016-08-31 Thread Mujtaba Chohan
Something seems inherently wrong in these test results.
* How are you running Phoenix queries? Were the concurrent Phoenix queries using the same JVM? Was the JVM restarted after changing the number of concurrent users?
* Is the response time plotted when the query is executed for the first time or second

Re: Phoenix has slow response times compared to HBase

2016-08-31 Thread Mujtaba Chohan
es and your HBase equivalent code? * Any phoenix tuning defaults that you changed? Thanks, Mujtaba (previous response wasn't complete before I hit send) On Wed, Aug 31, 2016 at 10:40 AM, Mujtaba Chohan wrote: > Something seems inherently wrong in these test results. > > * Ho

Re: Question regarding designing row keys

2016-10-04 Thread Mujtaba Chohan
If you lead with a timestamp key, you might want to consider experimenting with salting, as writes would hotspot on a single region if keys are monotonically increasing. On Tue, Oct 4, 2016 at 8:04 AM, Ciureanu Constantin < ciureanu.constan...@gmail.com> wr
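
A sketch of what such a salted DDL could look like; the schema and bucket count are illustrative only:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class SaltedTable {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
                 Statement stmt = conn.createStatement()) {
                // Salting spreads monotonically increasing timestamp keys
                // across buckets instead of hammering one region.
                stmt.execute("CREATE TABLE EVENTS ("
                        + " TS DATE NOT NULL,"
                        + " EVENT_ID VARCHAR NOT NULL,"
                        + " PAYLOAD VARCHAR,"
                        + " CONSTRAINT PK PRIMARY KEY (TS, EVENT_ID))"
                        + " SALT_BUCKETS = 8");
            }
        }
    }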

Re: Using Apache perf with Hbase 1.1

2016-10-18 Thread Mujtaba Chohan
> > Cannot get all table regions
Check that there are no offline regions. See the related thread here. On Tue, Oct 18, 2016 at 2:11 PM, Pradheep Shanmugam < pradheep.shanmu...@info

Re: Export large query results to CSV

2017-05-15 Thread Mujtaba Chohan
You might be able to use sqlline to export. Use the !outputformat csv and !record commands to export query results as CSV locally. On Sun, May 14, 2017 at 8:22 PM, Josh Elser wrote: > I am not aware of any mechanisms in Phoenix that will automatically write > formatted data, locally or remotely. This will require
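
If you need the export programmatically instead of through sqlline, a plain JDBC loop is one alternative; a minimal sketch with no quoting/escaping of field values, and with table and file names as placeholders:

    import java.io.PrintWriter;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;

    public class CsvExport {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
                 ResultSet rs = conn.createStatement()
                         .executeQuery("SELECT * FROM MYTABLE");
                 PrintWriter out = new PrintWriter("export.csv")) {
                int cols = rs.getMetaData().getColumnCount();
                while (rs.next()) {
                    StringBuilder line = new StringBuilder();
                    for (int i = 1; i <= cols; i++) {
                        if (i > 1) line.append(',');
                        line.append(rs.getString(i)); // null-unsafe, sketch only
                    }
                    out.println(line);
                }
            }
        }
    }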

Re: Short Tables names and column names

2017-05-30 Thread Mujtaba Chohan
This holds true for Phoenix as well, and it provides built-in support for column mapping so you can still use long column names; see http://phoenix.apache.org/columnencoding.html. Also see the related performance optimization, the SINGLE_CELL_ARRAY_WITH_OFFSETS encoding for immutable data. On Tue, May 30, 2017
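
A sketch of the relevant table options, assuming the column-encoding feature (Phoenix 4.10+); the exact option names and values should be checked against the linked page:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class EncodedTable {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
                 Statement stmt = conn.createStatement()) {
                // Long, readable column names; encoding keeps the on-disk
                // qualifiers compact. Names here are hypothetical.
                stmt.execute("CREATE TABLE CLICKSTREAM ("
                        + " ID BIGINT PRIMARY KEY,"
                        + " VERY_DESCRIPTIVE_COLUMN_NAME VARCHAR)"
                        + " IMMUTABLE_ROWS = true,"
                        + " IMMUTABLE_STORAGE_SCHEME = SINGLE_CELL_ARRAY_WITH_OFFSETS,"
                        + " COLUMN_ENCODED_BYTES = 2");
            }
        }
    }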

Re: Short Tables names and column names

2017-05-30 Thread Mujtaba Chohan
> > please help. > > thanks, > -ash > > > On Tue, May 30, 2017 at 4:16 PM, Mujtaba Chohan > wrote: > >> Holds true for Phoenix as well and it provides built-in support for >> column mapping so you can still use long column names, see >> http://pho

Re: Spark & UpgradeInProgressException: Cluster is being concurrently upgraded from 4.11.x to 4.12.x

2017-11-10 Thread Mujtaba Chohan
You are probably being hit by https://issues.apache.org/jira/browse/PHOENIX-4335. Please upgrade to 4.13.0, which will be available by EOD today. On Fri, Nov 10, 2017 at 8:37 AM, Stepan Migunov < stepan.migu...@firstlinesoftware.com> wrote: > Hi, > > > > I have just upgraded my cluster to Phoenix 4.12 and

Re: Efficient way to get the row count of a table

2017-12-19 Thread Mujtaba Chohan
Another alternative outside Phoenix is to use the http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/RowCounter.html M/R job. On Tue, Dec 19, 2017 at 3:18 PM, James Taylor wrote: > If it needs to be 100% accurate, then count(*) is the only way. If your > data is write-once data, you might b

Re: Is first query to a table region way slower?

2018-01-29 Thread Mujtaba Chohan
Just to remove one variable, can you repeat the same test after truncating the Phoenix stats table? (Either truncate SYSTEM.STATS from the HBase shell or use SQL: delete from SYSTEM.STATS.) On Mon, Jan 29, 2018 at 4:36 PM, Pedro Boado wrote: > Yes, there is a rs.next(). > > In fact if I run this SELECT *