Re: Timeline consistency using PQS

2017-01-19 Thread Tulasi Paradarami
Could someone clarify how this property is used by Phoenix: phoenix.connection.consistency. If I set it in hbase-site.xml, does Phoenix utilize it for every query (even queries from PQS)? It's not documented on the website, but it's defined in QueryServices.java: // consistency configuration
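
A minimal sketch of what setting this per connection might look like, assuming the property name from QueryServices.java is honored as a JDBC connection property; MY_TABLE and zk-host are placeholders, and whether the setting also flows through PQS is exactly the open question above:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.util.Properties;

    public class TimelineConsistencyCheck {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            // Property name as defined in QueryServices.java; whether it takes
            // effect per connection (vs. only via hbase-site.xml) is untested here.
            props.setProperty("phoenix.connection.consistency", "timeline");

            try (Connection conn = DriverManager.getConnection(
                    "jdbc:phoenix:zk-host:2181", props)) {
                // MY_TABLE is a placeholder; with timeline consistency, reads
                // may be served by secondary region replicas.
                conn.createStatement().executeQuery("SELECT * FROM MY_TABLE LIMIT 1");
            }
        }
    }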

Timeline consistency using PQS

2017-01-19 Thread Tulasi Paradarami
Hi, Does PQS support HBase's timeline consistency (HBASE-10070)? Looking at the connection properties implementation within Avatica, I see that the following are defined: ["transactionIsolation", "schema", "readOnly", "dirty", "autoCommit", "catalog"] but there isn't a property defined for setting
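
For context, a thin-client connection to PQS looks like the sketch below; the host and port are placeholders, and it is an untested assumption whether an extra property like this is forwarded by Avatica to the server-side Phoenix connection, since it is not in the built-in property list quoted above:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.util.Properties;

    public class PqsThinClientCheck {
        public static void main(String[] args) throws Exception {
            // Thin-client URL for PQS (host and port are placeholders).
            String url = "jdbc:phoenix:thin:url=http://pqs-host:8765;serialization=PROTOBUF";

            // Untested assumption: this property may or may not be forwarded
            // to the server-side Phoenix connection by Avatica.
            Properties props = new Properties();
            props.setProperty("phoenix.connection.consistency", "timeline");

            try (Connection conn = DriverManager.getConnection(url, props)) {
                System.out.println("connected: " + !conn.isClosed());
            }
        }
    }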

Re: Moving column family into new table

2017-01-19 Thread Mark Heppner
I'll check when I'm on site tomorrow, but our (much smaller) local cluster is using the default hbase.hregion.max.filesize of 10 GB for HDP. hbase.hregion.majorcompaction is set to 7 days, so I'm sure it would have run by now. What would be the best filesize limit? Cloudera suggests having 20-200
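
If a per-table limit is preferable to changing the cluster-wide default, one option is overriding it on the table descriptor via the HBase 1.x Admin API; a sketch, where MY_TABLE is a placeholder and 20 GB is purely an illustrative value, not a recommendation:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;

    public class MaxFileSizeOverride {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Admin admin = conn.getAdmin()) {
                TableName table = TableName.valueOf("MY_TABLE"); // placeholder
                HTableDescriptor desc = admin.getTableDescriptor(table);
                // Per-table override of the cluster-wide hbase.hregion.max.filesize
                // (default 10 GB); 20 GB here is purely illustrative.
                desc.setMaxFileSize(20L * 1024 * 1024 * 1024);
                admin.modifyTable(table, desc);
            }
        }
    }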

Re: Moving column family into new table

2017-01-19 Thread Josh Mahonin
It's a bit peculiar that you've got it pre-split to 10 salt buckets but are seeing 400+ partitions. It sounds like HBase is splitting the regions on you, possibly due to the 'hbase.hregion.max.filesize' setting. You should be able to check the table details in the HBase Master UI to see how many
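
Besides the Master UI, the region count can be read programmatically; a sketch using the standard HBase client API, with MY_TABLE as a placeholder:

    import java.util.List;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HRegionLocation;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.RegionLocator;

    public class RegionCount {
        public static void main(String[] args) throws Exception {
            try (Connection conn =
                     ConnectionFactory.createConnection(HBaseConfiguration.create());
                 RegionLocator locator =
                     conn.getRegionLocator(TableName.valueOf("MY_TABLE"))) {
                List<HRegionLocation> regions = locator.getAllRegionLocations();
                // 10 salt buckets with no further splits would mean ~10 regions;
                // a count in the hundreds means HBase has been splitting them.
                System.out.println("regions: " + regions.size());
            }
        }
    }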

Re: Moving column family into new table

2017-01-19 Thread Mark Heppner
Jonathan, I do check the queries using EXPLAIN, but it doesn't work the same in Spark. In Spark, I can only see a very generic plan and it only tells me if certain filters are pushed down to Phoenix or not. Query hints are ignored, since they're first translated by the Spark or Hive query
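
To illustrate the limitation: Spark's plan output only reports which predicates were pushed to the data source, not Phoenix's own execution plan. A sketch, assuming the Phoenix 4.x DataFrame data source and placeholder table, column, and zkUrl values:

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class SparkPushdownCheck {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("pushdown-check").getOrCreate();

            // Table, column, and zkUrl are placeholders.
            Dataset<Row> df = spark.read()
                    .format("org.apache.phoenix.spark")
                    .option("table", "MY_TABLE")
                    .option("zkUrl", "zk-host:2181")
                    .load();

            // The plan shows only whether predicates were pushed to the source
            // ("PushedFilters: ..."); it does not surface Phoenix's EXPLAIN plan.
            df.filter("ID = 42").explain(true);
        }
    }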

Re: Moving column family into new table

2017-01-19 Thread Jonathan Leech
Do an EXPLAIN on your query to confirm that it's doing a full scan and not a skip scan. I typically use an IN () clause instead of OR, especially with compound keys. I have also had to hint queries to use a skip scan, e.g. /*+ SKIP_SCAN */. Phoenix seems to do a very good job not reading data
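
A sketch of checking this over JDBC, combining the IN () clause and the hint as suggested above; MY_TABLE, K1, and K2 are placeholders for a table with a compound primary key:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class SkipScanExplain {
        public static void main(String[] args) throws Exception {
            try (Connection conn =
                     DriverManager.getConnection("jdbc:phoenix:zk-host:2181");
                 Statement stmt = conn.createStatement()) {
                // EXPLAIN prints the plan instead of running the query.
                ResultSet rs = stmt.executeQuery(
                    "EXPLAIN SELECT /*+ SKIP_SCAN */ * FROM MY_TABLE " +
                    "WHERE (K1, K2) IN ((1, 'a'), (2, 'b'))");
                while (rs.next()) {
                    // Look for SKIP SCAN vs. FULL SCAN in the plan output.
                    System.out.println(rs.getString(1));
                }
            }
        }
    }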

Re: Phoenix tracing did not start

2017-01-19 Thread Mark Heppner
Pradheep, I don't think this works in HDP 2.5 either; I've never been able to get it to work. On Thu, Jan 19, 2017 at 4:29 AM, Ankit Singhal wrote: > Hi Pradheep, > > It seems tracing is not distributed as a part of HDP 2.4.3.0, please work > with your vendor for an

Re: Moving column family into new table

2017-01-19 Thread Mark Heppner
Thanks for the quick reply, Josh! For our demo cluster, we have 5 nodes, so the table was already set to 10 salt buckets. I know you can increase the salt buckets after the table is created, but how do you change the split points? The repartition in Spark seemed to be extremely inefficient, so we

Re: Moving column family into new table

2017-01-19 Thread Josh Mahonin
Hi Mark, At present, the Spark partitions are basically equivalent to the number of regions in the underlying HBase table. This is typically something you can control yourself, either using pre-splitting or salting (https://phoenix.apache.org/faq.html#Are_there_any_tips_for_optimizing_Phoenix).
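
One way to confirm the partitions-to-regions relationship described above; a sketch assuming the Phoenix 4.x DataFrame data source, with MY_TABLE and zk-host as placeholders:

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class PartitionCount {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("partition-count").getOrCreate();

            Dataset<Row> df = spark.read()
                    .format("org.apache.phoenix.spark")
                    .option("table", "MY_TABLE")      // placeholder
                    .option("zkUrl", "zk-host:2181")  // placeholder
                    .load();

            // Roughly one Spark partition per HBase region, so this should
            // track the region count shown in the Master UI.
            System.out.println("partitions: " + df.rdd().getNumPartitions());
        }
    }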

Moving column family into new table

2017-01-19 Thread Mark Heppner
Our use case is to analyze images using Spark. The images are typically ~1MB each, so in order to prevent the small files problem in HDFS, we went with HBase and Phoenix. For 20+ million images and metadata, this has been working pretty well so far. Since this is pretty new to us, we didn't create

Re: Phoenix tracing did not start

2017-01-19 Thread Ankit Singhal
Hi Pradheep, It seems tracing is not distributed as part of HDP 2.4.3.0; please work with your vendor for an appropriate solution. Regards, Ankit Singhal On Thu, Jan 19, 2017 at 4:48 AM, Pradheep Shanmugam <pradheep.shanmu...@infor.com> wrote: > Hi, > > I am using HDP 2.4.3.0-227. I am