[ANNOUNCE] Apache Phoenix 4.14 released

2018-06-11 Thread James Taylor
The Apache Phoenix team is pleased to announce the immediate availability of the 4.14.0 release. Apache Phoenix enables SQL-based OLTP and operational analytics for Apache Hadoop using Apache HBase as its backing store and providing integration with other projects in the Apache ecosystem such as

Re: [ANNOUNCE] Apache Phoenix 4.13.2 for CDH 5.11.2 released

2018-01-20 Thread James Taylor
On Sat, Jan 20, 2018 at 12:29 PM Pedro Boado wrote: > The Apache Phoenix team is pleased to announce the immediate availability > of the 4.13.2 release for CDH 5.11.2. Apache Phoenix enables SQL-based OLTP > and operational analytics for Apache Hadoop using Apache HBase as

Re: [ANNOUNCE] Apache Phoenix 4.13 released

2017-11-19 Thread James Taylor
/70cffa798d5f21ef87b02e07aeca8c7982b0b30251411b7be17fadf9@%3Cdev.phoenix.apache.org%3E On Sun, Nov 19, 2017 at 12:23 PM, Kumar Palaniappan < kpalaniap...@marinsoftware.com> wrote: > Are there any plans to release Phoenix 4.13 compatible with HBase 1.2? > > On Sat, Nov 11, 2017 at 5:57 PM, James T

[ANNOUNCE] Apache Phoenix 4.13 released

2017-11-11 Thread James Taylor
The Apache Phoenix team is pleased to announce the immediate availability of the 4.13.0 release. Apache Phoenix enables SQL-based OLTP and operational analytics for Apache Hadoop using Apache HBase as its backing store and providing integration with other projects in the Apache ecosystem such as

[ANNOUNCE] Apache Phoenix 4.12 released

2017-10-11 Thread James Taylor
The Apache Phoenix team is pleased to announce the immediate availability of the 4.12.0 release [1]. Apache Phoenix enables SQL-based OLTP and operational analytics for Apache Hadoop using Apache HBase as its backing store and providing integration with other projects in the Apache ecosystem such

[ANNOUNCE] Apache Phoenix 4.11 released

2017-07-07 Thread James Taylor
The Apache Phoenix team is pleased to announce the immediate availability of the 4.11.0 release. Apache Phoenix enables SQL-based OLTP and operational analytics for Apache Hadoop using Apache HBase as its backing store and providing integration with other projects in the Apache ecosystem such as

[ANNOUNCE] Apache Phoenix 4.10 released

2017-03-23 Thread James Taylor
The Apache Phoenix team is pleased to announce the immediate availability of the 4.10.0 release. Apache Phoenix enables SQL-based OLTP and operational analytics for Hadoop using Apache HBase as its backing store and providing integration with other projects in the ecosystem such as Spark, Hive,

[ANNOUNCE] PhoenixCon 2017 is a go!

2017-03-15 Thread James Taylor
I'm excited to announce that the 2nd Annual Apache Phoenix conference, PhoenixCon 2017 will take place the day after HBaseCon in San Francisco on Tuesday, June 13th from 10:30am-6pm. For more details, including to RSVP and submit a talk proposal, click here:

[ANNOUNCE] Apache Phoenix 4.9 released

2016-12-01 Thread James Taylor
Apache Phoenix enables OLTP and operational analytics for Apache Hadoop through SQL support using Apache HBase as its backing store and providing integration with other projects in the ecosystem such as Apache Spark, Apache Hive, Apache Pig, Apache Flume, and Apache MapReduce. We're pleased to

[ANNOUNCE] PhoenixCon the day after HBaseCon

2016-05-19 Thread James Taylor
The inaugural PhoenixCon will take place 9am-1pm on Wed, May 25th (at Salesforce @ 1 Market St, SF), the day after HBaseCon. We'll have two tracks: one for Apache Phoenix use cases and one for Apache Phoenix internals. To RSVP and for more details see here[1]. We hope you can make it! James

Re: [ANNOUNCE] PhoenixCon 2016 on Wed, May 25th 9am-1pm

2016-04-27 Thread James Taylor
ement for final approval. I am assuming > there is still a slot for my talk in use case section. I should go ahead > with my approval process. Correct? > > Thanks, > Anil Gupta > Sent from my iPhone > > > On Apr 26, 2016, at 5:56 PM, James Taylor <jamestay...@apache.org &

[ANNOUNCE] PhoenixCon 2016 on Wed, May 25th 9am-1pm

2016-04-26 Thread James Taylor
We invite you to attend the inaugural PhoenixCon on Wed, May 25th 9am-1pm (the day after HBaseCon) hosted by Salesforce.com in San Francisco. There will be two tracks: one for use cases and one for internals. Drop me a note if you're interested in giving a talk. To RSVP and for more details, see

[ANNOUNCE] Apache Phoenix 4.5 released

2015-08-05 Thread James Taylor
The Apache Phoenix team is pleased to announce the immediate availability of the 4.5 release with support for HBase 0.98/1.0/1.1. Together with the 4.4 release, highlights include: Spark Integration (4.4) [1] User Defined Functions (4.4) [2] Query Server with thin driver (4.4) [3] Pherf tool for

[ANNOUNCE] Apache Phoenix 4.3 released

2015-02-25 Thread James Taylor
The Apache Phoenix team is pleased to announce the immediate availability of the 4.3 release. Highlights include: - functional indexes [1] - map-reduce over Phoenix tables [2] - cross join support [3] - query hint to force index usage [4] - set HBase properties through ALTER TABLE - ISO-8601 date

[ANNOUNCE] Apache Phoenix meetup in SF on Tue, Feb 24th

2015-01-22 Thread James Taylor
I'm excited to announce the first ever Apache Phoenix meetup, hosted by salesforce.com in San Francisco on Tuesday, February 24th @ 6pm. More details here: http://www.meetup.com/San-Francisco-Apache-Phoenix-Meetup/events/220009583/ Please ping me if you're interested in presenting your company's

[ANNOUNCE] Apache Phoenix 4.2.2 and 3.2.2 released

2014-12-10 Thread James Taylor
The Apache Phoenix team is pleased to announce the immediate availability of the 4.2.2/3.2.2 release. For details of the release, see our release announcement[1]. The Apache Phoenix team [1] https://blogs.apache.org/phoenix/entry/announcing_phoenix_4_2_2

Re: Connecting Hbase to Elasticsearch with Phoenix

2014-09-10 Thread James Taylor
+1. Thanks, Alex. I added a blog pointing folks there as well: https://blogs.apache.org/phoenix/entry/connecting_hbase_to_elasticsearch_through On Wed, Sep 10, 2014 at 2:12 PM, Andrew Purtell apurt...@apache.org wrote: Thanks for writing in with this pointer Alex! On Wed, Sep 10, 2014 at 11:11

[ANNOUNCE] Apache Phoenix 3.1 and 4.1 released

2014-09-01 Thread James Taylor
Hello everyone, On behalf of the Apache Phoenix [1] project, a SQL database on top of HBase, I'm pleased to announce the immediate availability of our 3.1 and 4.1 releases [2]. These include many bug fixes along with support for nested/derived tables, tracing, and local indexing. For details of

Re: Region not assigned

2014-08-14 Thread James Taylor
On the first connection to the cluster when you've installed Phoenix 2.2.3 and were previously using Phoenix 2.2.2, Phoenix will upgrade your Phoenix tables to use the new coprocessor names (org.apache.phoenix.*) instead of the old coprocessor names (com.salesforce.phoenix.*). Thanks, James On

Re: Copy some records from Huge hbase table to another table

2014-05-23 Thread James Taylor
Hi Riyaz, You can do this with a single SQL command using Apache Phoenix, a SQL engine on top of HBase, and you'll get better performance than if you hand coded it using the HBase client APIs. Depending on your current schema, you may be able to run this command with no change to your data. Let's
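
A minimal sketch of the kind of single-statement copy meant here, via the Phoenix JDBC driver; the table names, filter, and quorum host are invented for the example:

    import java.sql.Connection;
    import java.sql.DriverManager;

    public class CopyRows {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost")) {
                // One UPSERT SELECT copies the matching rows; Phoenix parallelizes
                // the underlying scan across regions for you.
                conn.createStatement().executeUpdate(
                    "UPSERT INTO dst_table SELECT * FROM src_table "
                    + "WHERE created_date > CURRENT_DATE() - 7");
                conn.commit();
            }
        }
    }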

[ANNOUNCE] Apache Phoenix has graduated as a top level project

2014-05-22 Thread James Taylor
I'm pleased to announce that Apache Phoenix has graduated from the incubator to become a top level project. Thanks so much for all your help and support - we couldn't have done it without the fantastic HBase community! We're looking forward to continued collaboration. Regards, The Apache Phoenix

Re: hbase key design to efficient query on base of 2 or more column

2014-05-19 Thread James Taylor
If you use Phoenix, queries would leverage our Skip Scan: http://phoenix-hbase.blogspot.com/2013/05/demystifying-skip-scan-in-phoenix.html Assuming a row key made up of a low cardinality first value (like a byte representing an enum), followed by a high cardinality second value (like a date/time
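
As a hedged illustration of that key shape (table and column names invented), the skip scan lets a query that constrains only the second key column avoid a full table scan:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class SkipScanExample {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
                 Statement stmt = conn.createStatement()) {
                // Low-cardinality enum leads the key; high-cardinality time follows.
                stmt.execute("CREATE TABLE events (type TINYINT NOT NULL, "
                    + "event_time DATE NOT NULL, payload VARCHAR "
                    + "CONSTRAINT pk PRIMARY KEY (type, event_time))");
                // The skip scan jumps between the per-type key ranges rather than
                // scanning every row.
                stmt.executeQuery(
                    "SELECT * FROM events WHERE event_time > CURRENT_DATE() - 1");
            }
        }
    }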

Re: Questions on FuzzyRowFilter

2014-05-18 Thread James Taylor
in the first place and just store the index… ;-) (Yes, I thought about this too.) -Mike On May 16, 2014, at 7:50 PM, James Taylor jtay...@salesforce.com wrote: Hi Mike, I agree with you - the way you've outlined is exactly the way Phoenix has implemented it. It's a bit of a problem

Re: Questions on FuzzyRowFilter

2014-05-18 Thread James Taylor
when you say salt. On May 18, 2014, at 7:16 PM, James Taylor jtay...@salesforce.com wrote: @Mike, The biggest problem is you're not listening. Please actually read my response (and you'll understand that what we're calling salting is not a random seed). Phoenix already has secondary

Re: Questions on FuzzyRowFilter

2014-05-18 Thread James Taylor
://phoenix.incubator.apache.org/secondary_indexing.html Thanks, James On Sun, May 18, 2014 at 1:56 PM, James Taylor jtay...@salesforce.com wrote: The top two hits when you Google for HBase salt are - Sematext blog describing salting as I described it in my email - Phoenix blog again describing salting

Re: Prefix salting pattern

2014-05-18 Thread James Taylor
@Software Dev - might be feasible to implement a Thrift client that speaks Phoenix JDBC. I believe this is similar to what Hive has done. Thanks, James On Sun, May 18, 2014 at 1:19 PM, Mike Axiak m...@axiak.net wrote: In our measurements, scanning is improved by performing against n range

Re: Prefix salting pattern

2014-05-17 Thread James Taylor
No, there's nothing wrong with your thinking. That's exactly what Phoenix does - use the modulo of the hash of the key. It's important that you can calculate the prefix byte so that you can still do fast point lookups. Using a modulo that's bigger than the number of region servers can make sense
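
A sketch of that calculation, assuming a single-byte prefix and a bucket count of at most 256; the hash function and names are placeholders:

    import java.util.Arrays;

    public class SaltPrefix {
        // The prefix is derived from the row key itself, so a point lookup can
        // recompute it and go straight to the right region.
        static byte prefixByte(byte[] rowKey, int buckets) {
            int hash = Arrays.hashCode(rowKey); // any stable hash works
            return (byte) Math.floorMod(hash, buckets);
        }
    }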

Re: Questions on FuzzyRowFilter

2014-05-16 Thread James Taylor
Hi Mike, I agree with you - the way you've outlined is exactly the way Phoenix has implemented it. It's a bit of a problem with terminology, though. We call it salting: http://phoenix.incubator.apache.org/salted.html. We hash the key, mod the hash with the SALT_BUCKET value you provide, and
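
In DDL terms, a minimal sketch (table name and bucket count illustrative); Phoenix computes and prepends the salt byte transparently on every read and write:

    import java.sql.Connection;
    import java.sql.DriverManager;

    public class SaltedTable {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost")) {
                // SALT_BUCKETS tells Phoenix to hash each row key and prepend
                // hash % 16 as a single leading byte, spreading writes evenly.
                conn.createStatement().execute(
                    "CREATE TABLE metrics (host VARCHAR NOT NULL, ts DATE NOT NULL, "
                    + "val DOUBLE CONSTRAINT pk PRIMARY KEY (host, ts)) SALT_BUCKETS = 16");
            }
        }
    }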

Re: How to implement sorting in HBase scans for a particular column

2014-04-29 Thread James Taylor
Hi Vikram, I see you sent a question to the Phoenix mailing list back in Dec on how to use Phoenix 2.1.2 with Hadoop 2 for HBase 0.94. Looks like you were having trouble building Phoenix with the hadoop2 profile. In our 3.0/4.0 releases we bundle the phoenix jars pre-built with both hadoop1 and hadoop2, so

Re: How to get specified rows and avoid full table scanning?

2014-04-21 Thread James Taylor
Tao, Just wanted to give you a couple of relevant pointers to Apache Phoenix for your particular problem: - Preventing hotspotting by salting your table: http://phoenix.incubator.apache.org/salted.html - Pig Integration for your map/reduce job:

[ANNOUNCE] Apache Phoenix releases next major version

2014-04-12 Thread James Taylor
The Apache Phoenix team is pleased to announce the release of its next major version (3.0 and 4.0) from the Apache Incubator. Phoenix is a SQL query engine for Apache HBase, a NoSQL data store. It is accessed as a JDBC driver and enables querying and managing HBase tables using SQL. Major new

Re: [VOTE] The 4th HBase 0.98.1 release candidate (RC3) is available for download

2014-04-03 Thread James Taylor
I implore you to stick with releasing RC3. Phoenix 4.0 has no release it can currently run on. Phoenix doesn't use SingleColumnValueFilter, so it seems that HBASE-10850 has no impact wrt Phoenix. Can't we get these additional bugs in 0.98.2 - it's one month away [1]? James [1]

Re: [VOTE] The 4th HBase 0.98.1 release candidate (RC3) is available for download

2014-04-03 Thread James Taylor
am fine with giving the next RC a bit shorter voting period. Cheers On Thu, Apr 3, 2014 at 8:57 AM, James Taylor jtay...@salesforce.com wrote: I implore you to stick with releasing RC3. Phoenix 4.0 has no release it can currently run on. Phoenix doesn't use SingleColumnValueFilter, so

Re: [VOTE] The 4th HBase 0.98.1 release candidate (RC3) is available for download

2014-04-03 Thread James Taylor
a definitive statement on if a critical/blocker bug exists for Phoenix or not? If not, we have sufficient votes at this point to carry the RC and can go forward with the release at the end of the vote period. On Apr 3, 2014, at 5:57 PM, James Taylor jtay...@salesforce.com wrote: I implore

Re: how to reverse an integer for rowkey?

2014-03-27 Thread James Taylor
Another option is to use Apache Phoenix and let it do these things for you: CREATE TABLE my_table( intField INTEGER, strField VARCHAR, CONSTRAINT pk PRIMARY KEY (intField DESC, strField)); Thanks, James @JamesPlusPlus http://phoenix.incubator.apache.org/ On Thu, Mar

Re: Filters failing to compare negative numbers (int,float,double or long)

2014-03-19 Thread James Taylor
Another option is to use Apache Phoenix ( http://phoenix.incubator.apache.org/) as it takes care of all these details for you automatically. Cheers, James On Wed, Mar 19, 2014 at 7:49 AM, Ted Yu yuzhih...@gmail.com wrote: In 0.96+, extensible data type API is provided. Please take a look at

Re: org.apache.hadoop.hbase.ipc.SecureRpcEngine class not found in HBase jar

2014-03-04 Thread James Taylor
Let's just target your patch for the Phoenix 4.0 release so we can rely on Maven having what we need. Thanks, James On Tue, Mar 4, 2014 at 11:29 AM, anil gupta anilgupt...@gmail.com wrote: Phoenix refers to the maven artifact of HBase. If it's not in the Maven repo of HBase then either we add the

Re: HBase Schema for IPTC News ML G2

2014-03-03 Thread James Taylor
Hi Jigar, Take a look at Apache Phoenix: http://phoenix.incubator.apache.org/ It allows you to use SQL to query over your HBase data and supports composite primary keys, so you could create a schema like this: create table news_message(guid varchar not null, version bigint not null,

Re: creating tables from mysql to hbase

2014-02-18 Thread James Taylor
Hi Jignesh, Phoenix has support for multi-tenant tables: http://phoenix.incubator.apache.org/multi-tenancy.html. Also, your primary key constraint would transfer over as-is, since Phoenix supports composite row keys. Essentially your pk constraint values get concatenated together to form your row
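
A hedged sketch of how those pieces fit together (names invented): the tenant id is the leading PK column of a MULTI_TENANT table, and tenant connections carry it as a connection property:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.util.Properties;

    public class MultiTenantExample {
        public static void main(String[] args) throws Exception {
            // A global connection creates the shared table.
            try (Connection global = DriverManager.getConnection("jdbc:phoenix:localhost")) {
                global.createStatement().execute(
                    "CREATE TABLE orders (tenant_id VARCHAR NOT NULL, "
                    + "order_id BIGINT NOT NULL, amount DECIMAL "
                    + "CONSTRAINT pk PRIMARY KEY (tenant_id, order_id)) MULTI_TENANT = true");
            }
            // A tenant-scoped connection only ever sees its own rows.
            Properties props = new Properties();
            props.setProperty("TenantId", "acme");
            try (Connection tenant = DriverManager.getConnection("jdbc:phoenix:localhost", props)) {
                tenant.createStatement().executeQuery("SELECT * FROM orders"); // implicitly tenant_id = 'acme'
            }
        }
    }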

Re: HBase load distribution vs. scan efficiency

2014-01-20 Thread James Taylor
Hi William, Phoenix uses this bucket mod solution as well ( http://phoenix.incubator.apache.org/salted.html). For the scan, you have to run it in every possible bucket. You can still do a range scan, you just have to prepend the bucket number to the start/stop key of each scan you do, and then you
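
A rough sketch of that client-side pattern against the 2014-era raw HBase API (bucket count and key names assumed); each bucket gets its own cheap range scan and the results are merge-sorted afterward:

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class BucketScans {
        // One Scan per bucket: prepend the bucket byte to the logical start/stop
        // keys so each scan stays a contiguous range scan.
        static List<Scan> bucketScans(byte[] start, byte[] stop, int numBuckets) {
            List<Scan> scans = new ArrayList<>();
            for (int b = 0; b < numBuckets; b++) {
                scans.add(new Scan(Bytes.add(new byte[] { (byte) b }, start),
                                   Bytes.add(new byte[] { (byte) b }, stop)));
            }
            return scans;
        }
    }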

Re: HBase load distribution vs. scan efficiency

2014-01-20 Thread James Taylor
, Jan 20, 2014 at 8:15 PM, James Taylor jtay...@salesforce.com wrote: Hi William, Phoenix uses this bucket mod solution as well ( http://phoenix.incubator.apache.org/salted.html). For the scan, you have to run it in every possible bucket. You can still do a range scan, you just have

Re: Question on efficient, ordered composite keys

2014-01-14 Thread James Taylor
Hi Henning, My favorite implementation of efficient composite row keys is Phoenix. We support composite row keys whose byte representation sorts according to the natural sort order of the values (inspired by Lily). You can use our type system independent of querying/inserting data with Phoenix,

Re: use hbase as distributed crawl's scheduler

2014-01-04 Thread James Taylor
? On Sat, Jan 4, 2014 at 3:43 PM, James Taylor jtay...@salesforce.com wrote: Hi LiLi, Phoenix isn't an experimental project. We're on our 2.2 release, and many companies (including the company for which I'm employed, Salesforce.com) use it in production today. Thanks, James

Re: use hbase as distributed crawl's scheduler

2014-01-03 Thread James Taylor
in your cluster. You can read more about salting here: http://phoenix.incubator.apache.org/salted.html On Thu, Jan 2, 2014 at 11:36 PM, Li Li fancye...@gmail.com wrote: thank you. it's great. On Fri, Jan 3, 2014 at 3:15 PM, James Taylor jtay...@salesforce.com wrote: Hi LiLi, Have a look

Re: use hbase as distributed crawl's scheduler

2014-01-03 Thread James Taylor
do parallel scans for each bucket and do a merge sort on the client, so the cost is pretty low for this (we also provide a way of turning this off if your use case doesn't need it). Two years, JM? Now you're really going to have to start using Phoenix :-) On Friday, January 3, 2014, James Taylor

Re: secondary index feature

2014-01-03 Thread James Taylor
love you see if your implementation can fit into the framework we wrote - we would be happy to work to see if it needs some more hooks or modifications - I have a feeling this is pretty much what you guys will need -Jesse On Mon, Dec 23, 2013 at 10:01 AM, James Taylor jtay

Re: use hbase as distributed crawl's scheduler

2014-01-03 Thread James Taylor
great but it's now only an experimental project. I want to use only hbase. could you tell me the difference between Phoenix and hbase? If I use hbase only, how should I design the schema and some extra things for my goal? thank you On Sat, Jan 4, 2014 at 3:41 AM, James Taylor jtay...@salesforce.com

Re: use hbase as distributed crawl's scheduler

2014-01-02 Thread James Taylor
Otis, I didn't realize Nutch uses HBase underneath. Might be interesting if you serialized data in a Phoenix-compliant manner, as you could run SQL queries directly on top of it. Thanks, James On Thu, Jan 2, 2014 at 10:17 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hi, Have a

Re: use hbase as distributed crawl's scheduler

2014-01-02 Thread James Taylor
Hi LiLi, Have a look at Phoenix (http://phoenix.incubator.apache.org/). It's a SQL skin on top of HBase. You can model your schema and issue your queries just like you would with MySQL. Something like this: // Create table that optimizes for your most common query // (i.e. the PRIMARY KEY

Re: secondary index feature

2013-12-23 Thread James Taylor
Henning, Jesse Yates wrote the back-end of our global secondary indexing system in Phoenix. He designed it as a separate, pluggable module with no Phoenix dependencies. Here's an overview of the feature: https://github.com/forcedotcom/phoenix/wiki/Secondary-Indexing. The section that discusses the

Re: Performance tuning

2013-12-21 Thread James Taylor
FYI, scanner caching defaults to 1000 in Phoenix, but as folks have pointed out, that's not relevant in this case b/c only a single row is returned from the server for a COUNT(*) query. On Sat, Dec 21, 2013 at 2:51 PM, Kristoffer Sjögren sto...@gmail.com wrote: Yeah, I'm doing a count(*) query

Re: Errors :Undefined table and DoNotRetryIOException while querying from phoenix to hbase

2013-12-14 Thread James Taylor
Mathan, We already answered your question on the Phoenix mailing list. If you have a follow up question, please post it there. This is not an HBase issue. Thanks, James On Dec 14, 2013, at 2:10 PM, mathan kumar immathanku...@gmail.com wrote: -- Forwarded message -- From: x

[ANNOUNCE] Phoenix accepted as Apache incubator

2013-12-13 Thread James Taylor
The Phoenix team is pleased to announce that Phoenix[1] has been accepted as an Apache incubator project[2]. Over the next several weeks, we'll move everything over to Apache and work toward our first release. Happy to be part of the extended family. Regards, James [1]

Re: Online/Realtime query with filter and join?

2013-12-02 Thread James Taylor
I agree with Doug Meil's advice. Start with your row key design. In Phoenix, your PRIMARY KEY CONSTRAINT defines your row key. You should lead with the columns that you'll filter against most frequently. Then, take a look at adding secondary indexes to speedup queries against other columns.

Re: HBase Phoenix questions

2013-11-27 Thread James Taylor
Amit, So sorry we didn't answer your question before - I'll post an answer now over on our mailing list. Thanks, James On Wed, Nov 27, 2013 at 8:46 AM, Amit Sela am...@infolinks.com wrote: I actually asked some of these questions in the phoenix-hbase-user googlegroup but never got an

Re: How to get Metadata information in Hbase

2013-11-25 Thread James Taylor
One other tool option for you is to use Phoenix. You use SQL to create a table and define the columns through standard DDL. Your columns make up the allowed KeyValues for your table and the metadata is surfaced through the standard JDBC metadata APIs (with column family mapping to table catalog).
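
For example, a short sketch with the standard JDBC metadata calls (URL and table name are placeholders):

    import java.sql.Connection;
    import java.sql.DatabaseMetaData;
    import java.sql.DriverManager;
    import java.sql.ResultSet;

    public class ShowColumns {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost")) {
                DatabaseMetaData md = conn.getMetaData();
                // Standard getColumns() call; TABLE_CAT carries the column family.
                try (ResultSet rs = md.getColumns(null, null, "MY_TABLE", null)) {
                    while (rs.next()) {
                        System.out.println(rs.getString("COLUMN_NAME")
                            + " cf=" + rs.getString("TABLE_CAT"));
                    }
                }
            }
        }
    }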

Re: HFile block size

2013-11-25 Thread James Taylor
FYI, you can define BLOCKSIZE in your hbase-site.xml, just as with HBase, to make it global. Thanks, James On Mon, Nov 25, 2013 at 9:08 PM, Azuryy Yu azury...@gmail.com wrote: There is no way to declare a global property in Phoenix, you have to declare BLOCKSIZE in each 'create' SQL. such

Re: hbase suitable for churn analysis ?

2013-11-14 Thread James Taylor
We ingest logs using Pig to write Phoenix-compliant HFiles, load those into HBase and then use Phoenix (https://github.com/forcedotcom/phoenix) to query directly over the HBase data through SQL. Regards, James On Thu, Nov 14, 2013 at 9:35 AM, sam wu swu5...@gmail.com wrote: we ingest data

Re: HBASE help

2013-10-28 Thread James Taylor
Take a look at Phoenix (https://github.com/forcedotcom/phoenix) which will allow you to issue SQL to directly create tables, insert data, and run queries over HBase using the data model described below. Thanks, James On Oct 28, 2013, at 8:47 AM, saiprabhur saiprab...@gmail.com wrote: Hi Folks,

Re: [ANNOUNCE] Phoenix v 2.1 released

2013-10-28 Thread James Taylor
. as fast or faster than a batched get). Thanks, James On Mon, Oct 28, 2013 at 11:14 AM, Asaf Mesika asaf.mes...@gmail.com wrote: I couldn't get the Row Value Constructor feature. Do you perhaps have a real world use case to demonstrate this? On Friday, October 25, 2013, James Taylor wrote

[ANNOUNCE] Phoenix v 2.1 released

2013-10-24 Thread James Taylor
The Phoenix team is pleased to announce the immediate availability of Phoenix 2.1 [1]. More than 20 individuals contributed to the release. Here are some of the new features now available: * Secondary Indexing [2] to create and automatically maintain global indexes over your primary table. -

Re: [ANNOUNCE] Phoenix v 2.1 released

2013-10-24 Thread James Taylor
yuzhih...@gmail.com wrote: From https://github.com/forcedotcom/phoenix/wiki/Secondary-Indexing : Is date_col a column from the data table? CREATE INDEX my_index ON my_table (date_col DESC, v1) INCLUDE (v3) SALT_BUCKETS=10, DATA_BLOCK_ENCODING='NONE'; On Thu, Oct 24, 2013 at 5:24 PM, James

Re: row filter - binary comparator at certain range

2013-10-21 Thread James Taylor
Take a look at Phoenix (https://github.com/forcedotcom/phoenix). It supports both salting and fuzzy row filtering through its skip scan. On Sun, Oct 20, 2013 at 10:42 PM, Premal Shah premal.j.s...@gmail.com wrote: Have you looked at FuzzyRowFilter? Seems to me that it might satisfy your

Re: row filter - binary comparator at certain range

2013-10-21 Thread James Taylor
Phoenix restricts salting to a single byte. Salting perhaps is misnamed, as the salt byte is a stable hash based on the row key. Phoenix's skip scan supports sub-key ranges. We've found salting in general to be faster (though there are cases where it's not), as it ensures better parallelization.

Re: row filter - binary comparator at certain range

2013-10-21 Thread James Taylor
this is the base access pattern. HTH -Mike On Oct 21, 2013, at 11:37 AM, James Taylor jtay...@salesforce.com wrote: Phoenix restricts salting to a single byte. Salting perhaps is misnamed, as the salt byte is a stable hash based on the row key. Phoenix's skip scan supports sub-key

Re: row filter - binary comparator at certain range

2013-10-21 Thread James Taylor
of your regions will be 1/2 the max size… but the size you really want and 8-16 regions will be up to twice as big. On Oct 21, 2013, at 3:26 PM, James Taylor jtay...@salesforce.com wrote: What do you think it should be called, because prepending-row-key-with-single-hashed-byte doesn't have

Re: row filter - binary comparator at certain range

2013-10-21 Thread James Taylor
to, so you end up with all regions half filled except for the last region in each 'modded' value. I wouldn't say its a bad thing if you plan for it. On Oct 21, 2013, at 5:07 PM, James Taylor jtay...@salesforce.com wrote: We don't truncate the hash, we mod it. Why would you expect that data

Re: Write TimeSeries Data and Do Time Based Range Scans

2013-09-24 Thread James Taylor
Hey Anil, The solution you've described is the best we've found for Phoenix (inspired by the work of Alex at Sematext). You can do all of this in a few lines of SQL: CREATE TABLE event_data( who VARCHAR, type SMALLINT, id BIGINT, when DATE, payload VARBINARY CONSTRAINT pk PRIMARY KEY
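
A hedged sketch of the kind of time-based range read such a schema serves; column names follow the snippet above, and the bound values are placeholders:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.Timestamp;

    public class TimeRangeQuery {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost")) {
                // The leading key columns pin the scan to one contiguous key
                // range, so a time-based range scan touches only relevant rows.
                PreparedStatement ps = conn.prepareStatement(
                    "SELECT * FROM event_data WHERE who = ? AND type = ? "
                    + "AND when >= ? AND when < ?");
                ps.setString(1, "user-42");
                ps.setShort(2, (short) 1);
                ps.setTimestamp(3, Timestamp.valueOf("2013-09-01 00:00:00"));
                ps.setTimestamp(4, Timestamp.valueOf("2013-09-02 00:00:00"));
                ps.executeQuery();
            }
        }
    }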

Re: deploy salesforce phoenix coprocessor to hbase/lib??

2013-09-11 Thread James Taylor
/lib? Our customers said it has to. But I feel it is unnecessary and weird. Can you confirm? Thanks Tian-Ying -Original Message- From: James Taylor [mailto:jtay...@salesforce.com] Sent: Tuesday, September 10, 2013 4:40 PM To: user@hbase.apache.org Subject: Re: deploy salesforce

Re: deploy salesforce phoenix coprocessor to hbase/lib??

2013-09-10 Thread James Taylor
When a table is created with Phoenix, its HBase table is configured with the Phoenix coprocessors. We do not specify a jar path, so the Phoenix jar that contains the coprocessor implementation classes must be on the classpath of the region server. In addition to coprocessors, Phoenix relies on

Re: 答复: Fastest way to get count of records in huge hbase table?

2013-09-10 Thread James Taylor
Use Phoenix (https://github.com/forcedotcom/phoenix) by doing the following: CREATE VIEW myHTableName (key VARBINARY NOT NULL PRIMARY KEY); SELECT COUNT(*) FROM myHTableName; As fenghong...@xiaomi.com said, you still need to scan the table, but Phoenix will do it in parallel and use a coprocessor

Re: Concurrent connections to Hbase

2013-09-05 Thread James Taylor
Hey Kiru, The Phoenix team would be happy to work with you to benchmark your performance if you can give us specifics about your schema design, queries, and data sizes. We did something similar for Sudarshan for a Bloomberg use case here[1]. Thanks, James [1].

Re: HBase - stable versions

2013-09-04 Thread James Taylor
+1 to what Nicolas said. That goes for Phoenix as well. It's open source too. We do plan to port to 0.96 when our user community (Salesforce.com, of course, being one of them) demands it. Thanks, James On Wed, Sep 4, 2013 at 10:11 AM, Nicolas Liochon nkey...@gmail.com wrote: It's open

Re: how to export data from hbase to mysql?

2013-08-27 Thread James Taylor
Or if you'd like to be able to use SQL directly on it, take a look at Phoenix (https://github.com/forcedotcom/phoenix). James On Aug 27, 2013, at 8:14 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Take a look at sqoop? On 2013-08-27 23:08, ch huang justlo...@gmail.com wrote:

Re: Client Get vs Coprocessor scan performance

2013-08-19 Thread James Taylor
it). Is there a way to do a sort of user defined function on a column? That would take care of my calculation on that double. Thanks again. Regards, - kiru Kiru Pakkirisamy | webcloudtech.wordpress.com From: James Taylor jtay...@salesforce.com

Re: Client Get vs Coprocessor scan performance

2013-08-18 Thread James Taylor
Would be interesting to compare against Phoenix's Skip Scan (http://phoenix-hbase.blogspot.com/2013/05/demystifying-skip-scan-in-phoenix.html) which does a scan through a coprocessor and is more than 2x faster than multi Get (plus handles multi-range scans in addition to point gets). James On

Re: Client Get vs Coprocessor scan performance

2013-08-18 Thread James Taylor
), I will try to benchmark this table alone against Phoenix on another cluster. Thanks. Regards, - kiru Kiru Pakkirisamy | webcloudtech.wordpress.com From: James Taylor jtay...@salesforce.com To: user@hbase.apache.org user@hbase.apache.org Cc: Kiru

Re: Client Get vs Coprocessor scan performance

2013-08-18 Thread James Taylor
-- From: James Taylor jtay...@salesforce.com To: user@hbase.apache.org; Kiru Pakkirisamy kirupakkiris...@yahoo.com Sent: Sunday, August 18, 2013 2:07 PM Subject: Re: Client Get vs Coprocessor scan performance Kiru, If you're able to post the key values, row key

Re: [ANNOUNCE] Secondary Index in HBase - from Huawei

2013-08-13 Thread James Taylor
Fantastic! Let me know if you're up for surfacing this through Phoenix. Regards, James On Tue, Aug 13, 2013 at 7:48 AM, Anil Gupta anilgupt...@gmail.com wrote: Excited to see this! Best Regards, Anil On Aug 13, 2013, at 6:17 AM, zhzf jeff jeff.z...@gmail.com wrote: very google local

Re: Client Get vs Coprocessor scan performance

2013-08-12 Thread James Taylor
Hey Kiru, Another option for you may be to use Phoenix ( https://github.com/forcedotcom/phoenix). In particular, our skip scan may be what you're looking for: http://phoenix-hbase.blogspot.com/2013/05/demystifying-skip-scan-in-phoenix.html. Under-the-covers, the skip scan is doing a series of

Re: Help in designing row key

2013-07-03 Thread James Taylor
Hi Flavio, Have you had a look at Phoenix (https://github.com/forcedotcom/phoenix)? It will allow you to model your multi-part row key like this: CREATE TABLE flavio.analytics ( source INTEGER, type INTEGER, qual VARCHAR, hash VARCHAR, ts DATE CONSTRAINT pk PRIMARY KEY

Re: Help in designing row key

2013-07-03 Thread James Taylor
to have balanced regions as much as possible. So I think that in this case I will still use Bytes concatenation if someone confirms I'm doing it in the right way. On Wed, Jul 3, 2013 at 12:33 PM, James Taylor jtay...@salesforce.com wrote: Hi Flavio, Have you had a look at Phoenix (https

Re: Schema design for filters

2013-06-27 Thread James Taylor
Hi Kristoffer, Have you had a look at Phoenix (https://github.com/forcedotcom/phoenix)? You could model your schema much like an O/R mapper and issue SQL queries through Phoenix for your filtering. James @JamesPlusPlus http://phoenix-hbase.blogspot.com On Jun 27, 2013, at 4:39 PM, Kristoffer

Re: HBase: Filters not working for negative integers

2013-06-26 Thread James Taylor
You'll need to flip the sign bit for ints and longs like Phoenix does. Feel free to borrow our serializers (in PDataType) or just use Phoenix. Thanks, James On 06/26/2013 12:16 AM, Madhukar Pandey wrote: Please ignore my previous mail..there was some copy paste issue in it.. this is the
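
The trick itself is tiny; a sketch for a 4-byte big-endian int, where the same idea extends to longs:

    public class SortableInt {
        // Flip the sign bit so that unsigned byte-wise comparison (what HBase
        // filters do) matches signed numeric order: negatives sort first.
        static byte[] toSortableBytes(int v) {
            int flipped = v ^ Integer.MIN_VALUE;
            return new byte[] {
                (byte) (flipped >>> 24), (byte) (flipped >>> 16),
                (byte) (flipped >>> 8),  (byte) flipped
            };
        }
    }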

Re: Scan performance

2013-06-22 Thread James Taylor
Hi Tony, Have you had a look at Phoenix (https://github.com/forcedotcom/phoenix), a SQL skin over HBase? It has a skip scan that will let you model a multi-part row key and skip through it efficiently as you've described. Take a look at this blog for more info:

Re: querying hbase

2013-06-01 Thread James Taylor
...@apache.org wrote: On Thu, May 23, 2013 at 5:10 PM, James Taylor jtay...@salesforce.com wrote: Have there been any discussions on running the HBase server in an OSGi container? I believe the only discussions have been on avoiding talk about coprocessor reloading

Re: querying hbase

2013-05-31 Thread James Taylor
On 05/24/2013 02:50 PM, Andrew Purtell wrote: On Thu, May 23, 2013 at 5:10 PM, James Taylor jtay...@salesforce.com wrote: Have there been any discussions on running the HBase server in an OSGi container? I believe the only discussions have been on avoiding talk about coprocessor reloading

Re: Couting number of records in a HBase table

2013-05-28 Thread James Taylor
Another option is Phoenix (https://github.com/forcedotcom/phoenix), where you'd do SELECT count(*) FROM my_table Regards, James On 05/28/2013 03:25 PM, Ted Yu wrote: Take a look at http://hbase.apache.org/book.html#rowcounter Cheers On Tue, May 28, 2013 at 3:23 PM, Shahab Yunus

Re: querying hbase

2013-05-23 Thread James Taylor
I did not try Phoenix yet, but I think you need to upload the JAR on all the region servers first, and then restart them, right? People might not have the rights to do that. That's why I thought Phoenix was overkill regarding the need to just list a table's content on a screen. JM 2013/5/22 James

Re: querying hbase

2013-05-22 Thread James Taylor
Hey JM, Can you expand on what you mean? Phoenix is a single jar, easily deployed to any HBase cluster. It can map to existing HBase tables or create new ones. It allows you to use SQL (a fairly popular language) to query your data, and it surfaces its functionality as a JDBC driver so that
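
A minimal sketch of that JDBC usage (quorum host, port, and query are placeholders):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;

    public class QueryExample {
        public static void main(String[] args) throws Exception {
            // The connection string carries the ZooKeeper quorum (and optionally
            // the client port), so any JDBC-aware tool can talk to Phoenix.
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zkhost:2181");
                 ResultSet rs = conn.createStatement()
                     .executeQuery("SELECT * FROM my_table LIMIT 10")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            }
        }
    }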

Re: querying hbase

2013-05-22 Thread James Taylor
Hi Aji, With Phoenix, you pass through the client port in your connection string, so this would not be an issue. If you're familiar with SQL Developer, then Phoenix supports something similar with SQuirrel: https://github.com/forcedotcom/phoenix#sql-client Regards, James On 05/22/2013 07:42

Re: [ANNOUNCE] Phoenix 1.2 is now available

2013-05-20 Thread James Taylor
give you a bit more detail. Regards, James On 05/20/2013 04:07 AM, Azuryy Yu wrote: why off-list? it would be better share here. --Send from my Sony mobile. On May 18, 2013 12:14 AM, James Taylor jtay...@salesforce.com wrote: Anil, Yes, everything is in the Phoenix GitHub repo. Will give you

Re: Some Hbase questions

2013-05-19 Thread James Taylor
Hi Vivek, Take a look at the SQL skin for HBase called Phoenix (https://github.com/forcedotcom/phoenix). Instead of using the native HBase client, you use regular JDBC and Phoenix takes care of making the native HBase calls for you. We support composite row keys, so you could form your row

Re: [ANNOUNCE] Phoenix 1.2 is now available

2013-05-17 Thread James Taylor
name/classes? I haven't had the opportunity to try out Phoenix yet but I would like to have a look at the implementation. Thanks, Anil Gupta On Thu, May 16, 2013 at 4:15 PM, James Taylor jtay...@salesforce.com wrote: Hi Anil, No HBase changes were required. We're already leveraging coprocessors

[ANNOUNCE] Phoenix 1.2 is now available

2013-05-16 Thread James Taylor
We are pleased to announce the immediate availability of Phoenix 1.2 (https://github.com/forcedotcom/phoenix/wiki/Download). Here are some of the release highlights: * Improve performance of multi-point and multi-range queries (20x plus) using new skip scan * Support TopN queries (3-70x

Re: [ANNOUNCE] Phoenix 1.2 is now available

2013-05-16 Thread James Taylor
similar stuff in https://issues.apache.org/jira/browse/HBASE-7474. I am interested in knowing the details about that implementation. Thanks, Anil Gupta On Thu, May 16, 2013 at 12:29 PM, James Taylor jtay...@salesforce.com wrote: We are pleased to announce the immediate availability of Phoenix 1.2

Re: Get all rows that DON'T have certain qualifiers

2013-05-14 Thread James Taylor
Hi Amit, Using Phoenix, the SQL skin over HBase (https://github.com/forcedotcom/phoenix), you'd do this: select * from myTable where value1 is null or value2 is null Regards, James http://phoenix-hbase.blogspot.com @JamesPlusPlus On May 14, 2013, at 6:56 AM, samar.opensource

Re: Coprocessors

2013-05-01 Thread James Taylor
Sudarshan, Below are the results that Mujtaba put together. He put together two versions of your schema: one with the ATTRIBID as part of the row key and one with it as a key value. He also benchmarked the query time both when all of the data was in the cache and when all of the data was read

Re: HBase and Datawarehouse

2013-04-30 Thread James Taylor
Phoenix will succeed if HBase succeeds. Phoenix just makes it easier to drive HBase to its maximum capability. IMHO, if HBase is to make further gains in the OLAP space, scans need to be faster and new, more compressed columnar-store type block formats need to be developed. Running inside

Re: Read access pattern

2013-04-30 Thread James Taylor
bq. The downside that I see is the bucket_number that we have to maintain both at time of reading/writing and update it in case of cluster restructuring. I agree that this maintenance can be painful. However, Phoenix (https://github.com/forcedotcom/phoenix) now supports salting, automating
