My guess is that, in order to enforce the limit, it's effectively single-threaded in either the select or the upsert.
> On Dec 17, 2018, at 6:43 PM, Shawn Li wrote:
>
> Hi Vincent,
>
> Thanks for explaining. That makes much more sense now and it explains the
> high memory usage when
I would try writing the hourly values as 24 columns in a daily row, or as an
array type.
I'm not up to speed on the latest Phoenix features, but if it could update a
daily sum on the fly, that might be OK. If that doesn't exist yet or isn't
performant, it could be done in an HBase coprocessor.
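A minimal sketch of the two layouts suggested above; the table and column names are hypothetical:

```sql
-- Option 1: one row per day, 24 hourly columns
CREATE TABLE DAILY_METRICS (
    SERIES_ID   VARCHAR NOT NULL,
    METRIC_DATE DATE NOT NULL,
    H00 DOUBLE, H01 DOUBLE,  -- ... one column per hour, through H23
    H23 DOUBLE,
    CONSTRAINT PK PRIMARY KEY (SERIES_ID, METRIC_DATE)
);

-- Option 2: one row per day, hourly values in a Phoenix array
CREATE TABLE DAILY_METRICS_ARR (
    SERIES_ID   VARCHAR NOT NULL,
    METRIC_DATE DATE NOT NULL,
    HOURLY      DOUBLE ARRAY[24],
    CONSTRAINT PK PRIMARY KEY (SERIES_ID, METRIC_DATE)
);
```

Either way, a day's worth of values lands in one row instead of 24, which is the point of the suggestion.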
This seems similar to a failure scenario I've seen a couple of times. I believe
after multiple restarts you got lucky and the tables were brought up by HBase in
the correct order.
What happens is some kind of semi-catastrophic failure where one or more region
servers go down with edits that weren't
Did you set the split policy to ConstantSizeRegionSplitPolicy?
> On Mar 22, 2018, at 2:56 PM, Adi Kadimetla wrote:
>
> Group,
> TABLE - with 50 salt buckets and configured as time series table.
>
> Having pre split into 50 SALT buckets we disabled the region splits using
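As a sketch, the salt buckets and the split policy can both be declared in the Phoenix DDL; the table and column names here are hypothetical, and the split policy is passed through as an HBase table property:

```sql
-- Hypothetical time-series table: pre-split into 50 salt buckets, with
-- further splitting effectively disabled via a constant-size policy and
-- a very large max file size.
CREATE TABLE TS_EVENTS (
    METRIC     VARCHAR NOT NULL,
    EVENT_TIME DATE NOT NULL,
    VAL        DOUBLE,
    CONSTRAINT PK PRIMARY KEY (METRIC, EVENT_TIME)
)
SALT_BUCKETS = 50,
SPLIT_POLICY = 'org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy',
MAX_FILESIZE = 107374182400;  -- 100 GB, so size-based splits rarely trigger
```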
I’ve done what you’re looking for by selecting the pk from the index in a
nested query and filtering the other column separately.
> On Feb 27, 2018, at 6:39 AM, Alexey Karpov wrote:
>
> Thanks for the quick answer, but my case is slightly different. I've seen these
> links
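The nested-query pattern described above looks roughly like this; the table, key, and column names are hypothetical:

```sql
-- IDX covers INDEXED_COL and yields the primary key; OTHER_COL is not
-- in the index, so it is filtered separately against the data table.
SELECT t.*
FROM MY_TABLE t
WHERE t.PK IN (
    SELECT PK FROM MY_TABLE WHERE INDEXED_COL = 'foo'  -- served by the index
)
AND t.OTHER_COL = 'bar';  -- applied to the data table rows
```

The inner query is cheap because it can be satisfied entirely from the index; only the matching keys are then fetched from the data table.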
I agree here, but will go farther. HBase needs an asynchronous API that goes
further than its current capability, for example building lambda functions
in the client tier that execute in a Java Streams manner. Being able to run
mapping functions, aggregations, etc. without needing
I had an idea a while back that I’ll share here because it’s relevant. It’s
basically a combined index, or index group, and it would work in this case. It
could be implemented in both global and local indexes. The data for two or more
indexes would be interleaved. For a local index, the
I recognize that name. Some of his posts were... memorable. I'm not surprised
to hear he was banned.
> On Aug 21, 2017, at 11:06 PM, James Taylor wrote:
>
> Hi Pawan,
> Why would you listen to someone about the future of Apache Phoenix who has no
> involvement in or
I think you want scan_next_rate for reads and mutate_rate for writes.
> On Jul 26, 2017, at 3:53 AM, Batyrshin Alexander <0x62...@gmail.com> wrote:
>
>
>> On 26 Jul 2017, at 12:49, Batyrshin Alexander <0x62...@gmail.com> wrote:
>>
>> Hello,
>> I'm collecting metrics from region servers -
>>
with that approach? For example, if I wanted
> to change a PK column type from VARCHAR to FLOAT, is this possible?
>
>
>
>> On Sun, Jun 18, 2017 at 10:50 AM, Jonathan Leech <jonat...@gmail.com> wrote:
>> Also, if you're updating that many values and not doing
, such as building or rebuilding indexes.
> On Jun 18, 2017, at 11:41 AM, Jonathan Leech <jonat...@gmail.com> wrote:
>
Another thing to consider, but only if your 1:1 mapping keeps the primary keys
the same, is to snapshot the table and restore it with the new name, and a
schema that is the union of the old and new schemas. I would put the new
columns in a new column family. Then use upsert select, mapreduce,
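A sketch of the copy step, assuming a hypothetical NEW_TABLE restored from the snapshot, with PK column ID, an old column OLD_COL, and the new columns in a new column family NEWCF:

```sql
-- After restoring the snapshot under the new name and declaring the
-- union schema over it, the copy itself is a single UPSERT SELECT:
UPSERT INTO NEW_TABLE (ID, NEWCF.NEW_COL)
SELECT ID, CAST(OLD_COL AS FLOAT)  -- any 1:1 mapping expression
FROM NEW_TABLE;
```

Because the primary keys are unchanged, the upsert writes the new column back onto the same rows it reads from.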
dex?
>
>
> -- Original --
> From: "Jonathan Leech";<jonat...@gmail.com>;
> Date: Sun, Jun 4, 2017 01:26 PM
> To: "user"<user@phoenix.apache.org>;
> Subject: Re: build index on existing big table
>
> Give Hbase region se
Give HBase region servers lots of memory, and set the number of HBase store
files and blocking files way high. Major compact before and after. You can
create an index async with MapReduce, but not rebuild it, AFAIK. Also, if
rebuilding one or more local indexes, I found it better to drop it first in HBase,
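The async path looks roughly like this; the table and index names are hypothetical, and the MapReduce IndexTool invocation is shown as a comment:

```sql
-- Declare the index without building it inline:
CREATE INDEX MY_IDX ON MY_TABLE (COL1) ASYNC;

-- Then populate it with the bundled MapReduce job, e.g.:
--   hbase org.apache.phoenix.mapreduce.index.IndexTool \
--     --data-table MY_TABLE --index-table MY_IDX --output-path /tmp/my_idx
```

The index stays in a building state until the IndexTool run completes, at which point Phoenix starts using it for queries.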
There are edits to make in a few files due to API changes in Spark 2.x. They
are all in one git commit in Phoenix-Spark.
> On May 31, 2017, at 1:11 AM, cmbendre wrote:
>
> I saw that JIRA. But the issue is i am using Phoenix on AWS EMR, which comes
> with 4.9.0. I
ta directly from HDFS, not go through
> phoenix/hbase for access.
>
> Is this possible?
>
>
> Best regards
>
> On May 23, 2017 3:35 PM, "Jonathan Leech" <jonat...@gmail.com> wrote:
> I think you would use Spark for that, via the Phoenix spark plug
Client merge sort is just merging already-sorted data from the parallel scans.
Look into the number of simultaneous queries vs. the Phoenix thread pool size
and numActiveHandlers in the HBase region servers. Salting might not be helping
you. Also try setting the fetch size on the query in JDBC. Make
Take a look at SOLR and Lucene. You should be able to do a text search on the
HBase data written via Phoenix. It works via the HBase replication mechanism, so
it should be near-real-time. I think you would have to use the SOLR API to do the
initial search, which would get you the HBase rowkey, which
AND
> CREATE_DT < TIMESTAMP '2017-04-01 00:00:00.000')
>SERVER AGGREGATE INTO DISTINCT ROWS BY [KEYWORD]
> CLIENT MERGE SORT
> CLIENT 100 ROW LIMIT
>
> 3. ROW_TIMESTAMP is time of current query execution time, right?
> Then it's not a right choice. :-(
>
>
> 2
If there are not a large number of distinct values of obj_id, try a SKIP_SCAN
hint. Otherwise, the secondary index should work; make sure it's actually used
via EXPLAIN. Finally, you might try the ROW_TIMESTAMP feature if it fits your
use case.
> On Feb 22, 2017, at 11:30 PM, NaHeon Kim
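Sketches of both suggestions, reusing the OBJ_ID and CREATE_DT names from the thread but on otherwise hypothetical tables:

```sql
-- Skip-scan hint on the existing key columns:
SELECT /*+ SKIP_SCAN */ OBJ_ID, CREATE_DT
FROM EVENTS
WHERE OBJ_ID IN ('a', 'b')
  AND CREATE_DT < TIMESTAMP '2017-04-01 00:00:00.000';

-- ROW_TIMESTAMP: map the time column onto the HBase cell timestamp,
-- so time-range predicates can prune at the HBase level.
CREATE TABLE EVENTS2 (
    CREATE_DT DATE NOT NULL,
    OBJ_ID    VARCHAR NOT NULL,
    VAL       DOUBLE,
    CONSTRAINT PK PRIMARY KEY (CREATE_DT ROW_TIMESTAMP, OBJ_ID)
);
```

Note that ROW_TIMESTAMP only helps if the column really is the write time, which is why it may not fit every use case.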
Do an explain on your query to confirm that it's doing a full scan and not a
skip scan.
I typically use an in () clause instead of or, especially with compound keys. I
have also had to hint queries to use a skip scan, e.g. /*+ SKIP_SCAN */.
Phoenix seems to do a very good job not reading data
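A sketch of the rewrite on a hypothetical compound key (K1, K2), using a row value constructor in the in () clause:

```sql
-- OR form, which can defeat key optimization:
SELECT * FROM T
WHERE (K1 = 'a' AND K2 = 1) OR (K1 = 'b' AND K2 = 2);

-- Equivalent IN () on the compound key, optionally hinted:
SELECT /*+ SKIP_SCAN */ * FROM T
WHERE (K1, K2) IN (('a', 1), ('b', 2));
```

The in () form hands Phoenix an explicit set of key points, which is exactly what a skip scan wants.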
I would try an array for that use case. In my experience in HBase, for
execution time querying the same data: more rows > more columns > fewer
columns. Also note that running a query in Phoenix creates a plan every
time, and the number of columns might matter there. Also the sqlline
ersion of
> hbase and phoenix are you using?
>
>> On Mon, Dec 5, 2016 at 9:53 AM Jonathan Leech <jonat...@gmail.com> wrote:
Looks like PHOENIX-2357 introduced a memory leak, at least for me... I end up
with old gen filled up with objects - 100,000,000 instances each of
WeakReference and LinkedBlockingQueue$Node, owned by
ConnectionQueryServicesImpl.connectionsQueue. The PhoenixConnection referred to
by the
This would be really useful. The use case I have that is similar is to map
Phoenix data to Hive (but the subset of Hive that Impala understands). I
imagine it could work by reading the SYSTEM.CATALOG table, or connection
metadata, and generating Hive CREATE TABLE statements. There would need to
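A hypothetical starting point for that approach: pull the column list and types for one table straight out of SYSTEM.CATALOG (the table name here is a placeholder):

```sql
-- Columns and Phoenix type codes for MY_TABLE, in declaration order;
-- rows with NULL COLUMN_NAME are table-level metadata, so skip them.
SELECT COLUMN_NAME, DATA_TYPE, ORDINAL_POSITION
FROM SYSTEM."CATALOG"
WHERE TABLE_SCHEM IS NULL
  AND TABLE_NAME = 'MY_TABLE'
  AND COLUMN_NAME IS NOT NULL
ORDER BY ORDINAL_POSITION;
```

A generator would then map each Phoenix DATA_TYPE code to the corresponding Hive type and emit the CREATE TABLE text.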
I think you're best off running DDL with a new table name, but you could
probably upsert the values yourself into system.catalog. If you have a lot of
data to copy, you can use hbase snapshots and restore into the new table name.
This would also take care of creating the underlying hbase table,
The direct HBase client probably made 500 direct calls, whereas Phoenix maybe
made fewer simultaneous calls, with a little waiting, and hit a sweeter spot for
load on your configuration.
> On Sep 2, 2016, at 7:06 PM, Mujtaba Chohan wrote:
>
> Single user average:
If the table is small, you can export to a flat file, copy it over, then
import, all using Phoenix cmd line utilities.
If there is connectivity between the clusters, and the schema is identical, for
small to mid-size tables, you can set up hbase replication, and do upsert into
x select * from
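With replication feeding a copy of the source table on the destination cluster, and identical schemas on both sides, the final copy step is a single statement; the names here are hypothetical:

```sql
-- On the destination cluster: fold the replicated rows into the target.
UPSERT INTO X SELECT * FROM REPLICATED_SOURCE;
```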
!set maxWidth 2000 (or something like that, check the help)
You can also set your terminal really wide prior to launching sqlline.
> On Apr 4, 2016, at 1:30 PM, Ian Maloney wrote:
>
> Using Phoenix 4.4.0 to query a view created on an HBase 1.1.2 table, using
>
be some internal state in the region
server coprocessors that wouldn't be there unless the DDL is run in the
cluster. Would like to avoid an hbase restart in the replica cluster.
Thanks,
Jonathan
> On Feb 29, 2016, at 5:09 PM, Jonathan Leech <jonat...@gmail.com> wrote:
>
Some are and some aren't working... The version is
4.5.2-1.clabs_phoenix1.2.0.p0.774 on CDH 5.5.1. Tried rebuilding on the
destination, then on both sides, then doing snapshots to transfer the data, all
to no avail. The data replicates, but Phoenix doesn't see it. I don't see any
obvious differences on
with all fields made static, and then copy the data from one to the other.
>
> Thanks,
> Steve
>
>> On Wed, Feb 24, 2016 at 7:11 PM, Jonathan Leech <jonat...@gmail.com> wrote:
You could also take a snapshot in hbase just prior to the drop table, then
restore it afterward.
> On Feb 24, 2016, at 12:25 PM, Steve Terrell wrote:
>
> Thanks for your quick and accurate responses!
>
>> On Wed, Feb 24, 2016 at 1:18 PM, Ankit Singhal
.
> On Feb 15, 2016, at 12:25 PM, Andrew Purtell <andrew.purt...@gmail.com> wrote:
>
> You might also consider moving back down to 7u79
>
>> On Feb 15, 2016, at 10:35 AM, Jonathan Leech <jonat...@gmail.com> wrote:
>>
Has anyone else seen this? Happening under load in JDK 1.7.0_80 / Phoenix
4.5.2 - Cloudera Labs. Based on the source code, it seems the JVM is
calling the wrong toObject() and then dumping. The correct toObject()
method is a couple of parent classes away, with some generics, and Sun / Oracle
must have