roadmap:
https://github.com/forcedotcom/phoenix/wiki#wiki-roadmap
We welcome feedback and contributions from the community to Phoenix and
look forward to working together.
Regards,
James Taylor
@JamesPlusPlus
...@mapbased.com wrote:
Great tool, I will try it later. Thanks for sharing!
2013/1/31 Devaraj Das d...@hortonworks.com
Congratulations, James. We will surely benefit from this tool.
On Wed, Jan 30, 2013 at 1:04 PM, James Taylor jtay...@salesforce.com
wrote:
We are pleased to announce the immediate
If you run a SQL query that does aggregation (i.e. uses a built-in
aggregation function like COUNT or does a GROUP BY), Phoenix will
orchestrate the running of a set of queries in parallel, segmented along
your row key (driven by the start/stop key plus region boundaries). We
take advantage of
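A rough sketch of that segmentation (hypothetical helper, not Phoenix code): given the scan's start/stop key and the sorted region start keys, split the range into per-region chunks that can then be scanned in parallel. Keys are plain strings here for simplicity; HBase compares byte[] lexicographically.

```java
import java.util.ArrayList;
import java.util.List;

public class ScanSplitter {
    // Split [start, stop) into chunks at each region boundary that
    // falls strictly inside the range, so each chunk maps to one region.
    static List<String[]> split(String start, String stop, List<String> regionStarts) {
        List<String[]> chunks = new ArrayList<>();
        String chunkStart = start;
        for (String boundary : regionStarts) {
            if (boundary.compareTo(chunkStart) > 0 && boundary.compareTo(stop) < 0) {
                chunks.add(new String[] { chunkStart, boundary });
                chunkStart = boundary;
            }
        }
        chunks.add(new String[] { chunkStart, stop });
        return chunks;
    }

    public static void main(String[] args) {
        for (String[] c : split("a", "z", List.of("f", "m", "t"))) {
            System.out.println(c[0] + " -> " + c[1]);
        }
    }
}
```

Each chunk becomes one scan, issued concurrently; the client then combines the partial aggregates.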
Another approach would be to use Phoenix
(http://github.com/forcedotcom/phoenix). You can model your schema as
you would in the relational world, but you get the horizontal
scalability of HBase.
James
On 02/06/2013 01:49 PM, Michael Segel wrote:
Overloading the time stamp aka the
Wanted to check with folks and see if they've seen an issue around this
before digging in deeper. I'm on 0.94.2. If I execute in parallel
multiple scans to different parts of the same region, they appear to be
processed serially. It's actually faster from the client side to execute
a single
(https://issues.apache.org/jira/browse/HBASE-7336). Fixed in 0.94.4.
I assume you have enough handlers, etc. (i.e. does the same happen if you issue
multiple scan requests across different regions of the same region server?)
-- Lars
From: James Taylor jtay
- Original Message -
From: James Taylor jtay...@salesforce.com
To: user@hbase.apache.org user@hbase.apache.org; lars hofhansl
la...@apache.org
Cc:
Sent: Friday, February 8, 2013 9:52 PM
Subject: Re: independent scans to same region processed serially
All data is in the blockcache
Filed https://issues.apache.org/jira/browse/HBASE-7805
Test case attached
It occurs only if the table has a region observer coprocessor.
James
On 02/09/2013 11:04 AM, lars hofhansl wrote:
If I execute in parallel multiple scans to different parts of the same region,
they appear to be
In 0.94.2, if the coprocessor class was on the HBase classpath, then the
jarFilePath argument to HTableDescriptor.addCoprocessor seemed to
essentially be ignored - it didn't matter if the jar could be found or
not. In 0.94.4 we're getting an error if this is the case. Is there a
way to
IMO, I don't think it's safe to change the KV in-place. We always create a new
KV in our coprocessors.
James
On Feb 12, 2013, at 6:41 AM, Mesika, Asaf asaf.mes...@gmail.com wrote:
I'm seeing a very strange behavior:
If I run a scan during major compaction, I can see both the modified Delta
Hello,
Have you considered using Phoenix
(https://github.com/forcedotcom/phoenix) for this use case? Phoenix is a
SQL layer on top of HBase. For this use case, you'd connect to your
cluster like this:
Class.forName("com.salesforce.phoenix.jdbc.PhoenixDriver"); // register driver
Connection conn = DriverManager.getConnection("jdbc:phoenix:<your zookeeper quorum>");
spotting when using time as the key. Or the problem with always
adding data to the right of the last row.
The same would apply with the project id, assuming that it too is a number that
grows incrementally with each project.
On Feb 17, 2013, at 4:50 PM, James Taylor jtay...@salesforce.com wrote
Unless I'm doing something wrong, it looks like the Maven repository
(http://mvnrepository.com/artifact/org.apache.hbase/hbase) only contains
HBase up to 0.94.3. Is there a different repo I should use, or if not,
any ETA on when it'll be updated?
James
Same with us on Phoenix - we use the setAttribute on the client side and
the getAttribute on the server side to pickup state on the Scan being
executed. Works great. One thing to keep in mind, though: for a region
observer coprocessor, the state you set on the client side will be sent
to each
We are pleased to announce the immediate availability of Phoenix v 1.1,
with support for HBase v 0.94.4 and above. Phoenix is a SQL layer on top
of HBase. For details, see our announcement here:
http://phoenix-hbase.blogspot.com/2013/02/annoucing-phoenix-v-11-support-for.html
Thanks,
James
, Ted Yu yuzhih...@gmail.com wrote:
I ran the test suite and the tests passed:
Tests run: 452, Failures: 0, Errors: 0, Skipped: 0
[INFO]
[INFO] BUILD SUCCESS
Good job.
On Mon, Feb 25, 2013 at 9:35 AM, James Taylor jtay
., but it
illustrates the idea.
On 02/26/2013 09:59 AM, Ted Yu wrote:
In the first graph on the performance page, what does 'key filter'
represent?
Thanks
On Tue, Feb 26, 2013 at 9:53 AM, James Taylor jtay...@salesforce.com wrote:
Both Phoenix and Impala provide SQL as a way to get at your data. Here
You can query existing tables if the data is serialized in the way that
Phoenix expects. For more detailed information and options, check out
my response to this issue:
https://github.com/forcedotcom/phoenix/issues/30 and check out our Data
Type language reference here:
Check your logs for whether your end-point coprocessor is hitting
zookeeper on every invocation to figure out the region start key.
Unfortunately (at least last time I checked), the default way of
invoking an end point coprocessor doesn't use the meta cache. You can go
through a combination of
Another possible solution for you: use Phoenix:
https://github.com/forcedotcom/phoenix
Phoenix would allow you to model your scenario using SQL through JDBC,
like this:
Connection conn = DriverManager.getConnection("jdbc:phoenix:<your zookeeper quorum>");
Statement stmt = conn.createStatement(
Hi Nick,
What do you mean by hashing algorithms?
Thanks,
James
On 03/15/2013 10:11 AM, Nick Dimiduk wrote:
Hi David,
Native support for a handful of hashing algorithms has also been discussed.
Do you think these should be supported directly, as opposed to using a
fixed-length String or
Another one to add to your list:
6. Phoenix (https://github.com/forcedotcom/phoenix)
Thanks,
James
On Mar 20, 2013, at 2:50 AM, Vivek Mishra vivek.mis...@impetus.co.in wrote:
I have used Kundera, persistence overhead on HBase API is minimal considering
feature set available for use within
Mohith,
Are you wanting to reduce the amount of data you're scanning and bring
down your query time when:
- you have a multi-part row key composed of a string and a time value, and
- you know the prefix of the string and a range of the time value?
That's possible (but not easy) to do with
From the SQL perspective, handling null is important. Phoenix supports
null in the following way:
- the absence of a key value
- an empty value in a key value
- an empty value in a multi part row key
- for variable length types (VARCHAR and DECIMAL) a null byte
separator would be used if not
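The null-separator idea for variable-length key parts can be sketched like this (illustrative encoding only, not Phoenix's exact PDataType format): each variable-length part is terminated by a zero byte, and a SQL NULL is simply the separator with no value bytes before it.

```java
public class CompositeKey {
    // Encode variable-length string parts of a composite row key.
    // A null part contributes only the separator byte, so NULL is
    // representable and the parts remain unambiguously delimited.
    static byte[] encode(String... parts) {
        java.io.ByteArrayOutputStream out = new java.io.ByteArrayOutputStream();
        for (String p : parts) {
            if (p != null) {
                byte[] b = p.getBytes(java.nio.charset.StandardCharsets.UTF_8);
                out.write(b, 0, b.length);
            }
            out.write(0); // null-byte separator (stands alone for SQL NULL)
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(encode("a", null, "b").length);
    }
}
```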
On 04/01/2013 04:41 PM, Nick Dimiduk wrote:
On Mon, Apr 1, 2013 at 4:31 PM, James Taylor jtay...@salesforce.com wrote:
From the SQL perspective, handling null is important.
From your perspective, it is critical to support NULLs, even at the expense
of fixed-width encodings at all
Hello,
We're doing some performance testing of the essential column family
feature, and we're seeing some performance degradation when comparing
with and without the feature enabled:
% of rows selected | Performance of scan relative to not enabling the feature
The case Max Lapan tried to address has the non-essential column family
carrying considerably more data compared to the essential column family.
Cheers
On Sat, Apr 6, 2013 at 11:05 PM, James Taylor jtay...@salesforce.com wrote:
Hello,
We're doing some performance testing of the essential column family
feature
Does your filter utilize hints?
It would be easier for me and other people to reproduce the issue you
experienced if you put your scenario in some test similar to
TestJoinedScanners.
Will take a closer look at the code Monday.
Cheers
On Sun, Apr 7, 2013 at 11:37 AM, James Taylor jtay
Hi Greame,
Are you familiar with Phoenix (https://github.com/forcedotcom/phoenix),
a SQL skin over HBase? We've just introduced a new feature (still in the
master branch) that'll do what you're looking for: transparently doing a
skip scan over the chunks of your HBase data based on your SQL
would be larger lazy CFs and/or a low percentage of values
selected.
Can you try to increase the 2nd CF values' size and rerun the test?
On Mon, Apr 8, 2013 at 10:38 AM, James Taylor jtay...@salesforce.com wrote:
In the TestJoinedScanners.java, is the 40% randomly distributed or
sequential?
In our
Phoenix will parallelize within a region:
SELECT count(1) FROM orders
I agree with Ted, though; even serially, 100,000 rows shouldn't take anywhere
near 6 mins. You say 100,000 rows. Can you tell us what it's ?
Thanks,
James
On Apr 19, 2013, at 2:37 AM, Ted Yu yuzhih...@gmail.com wrote:
On 04/25/2013 03:35 PM, Gary Helmling wrote:
I'm looking to write a service that runs alongside the region servers and
acts as a proxy b/w my application and the region servers.
I plan to use the logic in HBase client's HConnectionManager, to segment
my request of 1M rowkeys into sub-requests per
Thanks for the additional info, Sudarshan. This would fit well with the
implementation of Phoenix's skip scan.
CREATE TABLE t (
object_id INTEGER NOT NULL,
field_type INTEGER NOT NULL,
attrib_id INTEGER NOT NULL,
value BIGINT
CONSTRAINT pk PRIMARY KEY (object_id, field_type,
Our performance engineer, Mujtaba Chohan has agreed to put together a
benchmark for you. We only have a four node cluster of pretty average
boxes, but it should give you an idea.
No performance impact for the attrib_id not being part of the PK since
you're not filtering on them (if I
Phoenix will succeed if HBase succeeds. Phoenix just makes it easier to
drive HBase to its maximum capability. IMHO, if HBase is to make
further gains in the OLAP space, scans need to be faster and new, more
compressed columnar-store type block formats need to be developed.
Running inside
bq. The downside that I see, is the bucket_number that we have to
maintain both at time or reading/writing and update it in case of
cluster restructuring.
I agree that this maintenance can be painful. However, Phoenix
(https://github.com/forcedotcom/phoenix) now supports salting,
automating
Have you had a look at Phoenix (https://github.com/forcedotcom/phoenix)? It'll
use all of the parts of your row key and, depending on how much data you're
returning back to the client, will query over 10 million rows in seconds.
James
@JamesPlusPlus
http://phoenix-hbase.blogspot.com
On Apr 30,
Sudarshan,
Below are the results that Mujtaba put together. He put together two
versions of your schema: one with the ATTRIBID as part of the row key
and one with it as a key value. He also benchmarked the query time both
when all of the data was in the cache versus when all of the data was
read
Hi Amit,
Using Phoenix, the SQL skin over HBase
(https://github.com/forcedotcom/phoenix), you'd do this:
select * from myTable where value1 is null or value2 is null
Regards,
James
http://phoenix-hbase.blogspot.com
@JamesPlusPlus
On May 14, 2013, at 6:56 AM, samar.opensource
We are pleased to announce the immediate availability of Phoenix 1.2
(https://github.com/forcedotcom/phoenix/wiki/Download). Here are some of
the release highlights:
* Improve performance of multi-point and multi-range queries (20x plus)
using new skip scan
* Support TopN queries (3-70x
similar stuff in
https://issues.apache.org/jira/browse/HBASE-7474. I am interested in
knowing the details about that implementation.
Thanks,
Anil Gupta
On Thu, May 16, 2013 at 12:29 PM, James Taylor jtay...@salesforce.com wrote:
We are pleased to announce the immediate availability of Phoenix 1.2
name/classes?
I haven't got the opportunity to try out Phoenix yet but I would like to
have a look at the implementation.
Thanks,
Anil Gupta
On Thu, May 16, 2013 at 4:15 PM, James Taylor jtay...@salesforce.com wrote:
Hi Anil,
No HBase changes were required. We're already leveraging coprocessors
Hi Vivek,
Take a look at the SQL skin for HBase called Phoenix
(https://github.com/forcedotcom/phoenix). Instead of using the native
HBase client, you use regular JDBC and Phoenix takes care of making the
native HBase calls for you.
We support composite row keys, so you could form your row
give you a bit more
detail.
Regards,
James
On 05/20/2013 04:07 AM, Azuryy Yu wrote:
why off-list? it would be better share here.
--Send from my Sony mobile.
On May 18, 2013 12:14 AM, James Taylor jtay...@salesforce.com wrote:
Anil,
Yes, everything is in the Phoenix GitHub repo. Will give you
We've seen reasonable performance, with the caveat that you need to
parallelize the scan doing the aggregation. In our benchmarking, we have
the client scan each region in parallel and have a coprocessor aggregate
the row count and return a single row back (with the client then
totaling the
iwannaplay games funnlearnforkids@... writes:
Hi ,
I want to run query like
select month(eventdate),scene,count(1),sum(timespent) from eventlog
group by month(eventdate),scene
in HBase. Through Hive it's taking a lot of time for 40 million
records. Do we have any syntax in HBase to find
No, there's no sorted dimension. This would be a full table scan over
40M rows. This assumes the following:
1) your regions are evenly distributed across a four node cluster
2) unique combinations of month * scene are small enough to fit into memory
3) you chunk it up on the client side and run
Hey JM,
Can you expand on what you mean? Phoenix is a single jar, easily
deployed to any HBase cluster. It can map to existing HBase tables or
create new ones. It allows you to use SQL (a fairly popular language) to
query your data, and it surfaces its functionality as a JDBC driver so
that
Hi Aji,
With Phoenix, you pass through the client port in your connection
string, so this would not be an issue. If you're familiar with SQL
Developer, then Phoenix supports something similar with SQuirrel:
https://github.com/forcedotcom/phoenix#sql-client
Regards,
James
On 05/22/2013 07:42
I did not try Phoenix yet, but I think you need to
upload the JAR on all the region servers first, and then restart them,
right? People might not have the rights to do that. That's why I thought
Phoenix was overkill regarding the need to just list a table's content on a
screen.
JM
2013/5/22 James
Another option is Phoenix (https://github.com/forcedotcom/phoenix),
where you'd do
SELECT count(*) FROM my_table
Regards,
James
On 05/28/2013 03:25 PM, Ted Yu wrote:
Take a look at http://hbase.apache.org/book.html#rowcounter
Cheers
On Tue, May 28, 2013 at 3:23 PM, Shahab Yunus
On 05/24/2013 02:50 PM, Andrew Purtell wrote:
On Thu, May 23, 2013 at 5:10 PM, James Taylor jtay...@salesforce.com wrote:
Have there been any discussions on running the HBase server in an OSGi
container?
I believe the only discussions have been on avoiding talk about coprocessor
reloading
Hi Tony,
Have you had a look at Phoenix (https://github.com/forcedotcom/phoenix), a SQL
skin over HBase? It has a skip scan that will let you model a multi part row
key and skip through it efficiently as you've described. Take a look at this
blog for more info:
You'll need to flip the sign bit for ints and longs like Phoenix does.
Feel free to borrow our serializers (in PDataType) or just use Phoenix.
Thanks,
James
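The sign-bit flip mentioned above can be sketched as follows (illustrative helpers, not the actual PDataType API): flipping the high bit of a big-endian encoding makes signed ints compare correctly under HBase's unsigned lexicographic byte comparison.

```java
public class SortableInt {
    // XOR with Integer.MIN_VALUE flips only the sign bit, so negative
    // values map below positive ones in unsigned byte order.
    static byte[] toSortableBytes(int v) {
        int flipped = v ^ Integer.MIN_VALUE; // flip sign bit
        return new byte[] {
            (byte) (flipped >>> 24), (byte) (flipped >>> 16),
            (byte) (flipped >>> 8),  (byte) flipped
        };
    }

    // Unsigned lexicographic comparison, as HBase compares row keys.
    static int compareUnsigned(byte[] a, byte[] b) {
        for (int i = 0; i < a.length; i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) return d;
        }
        return 0;
    }

    public static void main(String[] args) {
        // -1 must sort before 1 once the sign bit is flipped
        System.out.println(compareUnsigned(toSortableBytes(-1), toSortableBytes(1)) < 0);
    }
}
```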
On 06/26/2013 12:16 AM, Madhukar Pandey wrote:
Please ignore my previous mail..there was some copy paste issue in it..
this is the
Hi Kristoffer,
Have you had a look at Phoenix (https://github.com/forcedotcom/phoenix)? You
could model your schema much like an O/R mapper and issue SQL queries through
Phoenix for your filtering.
James
@JamesPlusPlus
http://phoenix-hbase.blogspot.com
On Jun 27, 2013, at 4:39 PM, Kristoffer
Hi Flavio,
Have you had a look at Phoenix (https://github.com/forcedotcom/phoenix)?
It will allow you to model your multi-part row key like this:
CREATE TABLE flavio.analytics (
source INTEGER,
type INTEGER,
qual VARCHAR,
hash VARCHAR,
ts DATE
CONSTRAINT pk PRIMARY KEY
to have balanced regions as much as possible.
So I think that in this case I will still use Bytes concatenation if
someone confirm I'm doing it in the right way.
On Wed, Jul 3, 2013 at 12:33 PM, James Taylor jtay...@salesforce.com wrote:
Hi Flavio,
Have you had a look at Phoenix
(https
Hey Kiru,
Another option for you may be to use Phoenix (
https://github.com/forcedotcom/phoenix). In particular, our skip scan may
be what you're looking for:
http://phoenix-hbase.blogspot.com/2013/05/demystifying-skip-scan-in-phoenix.html.
Under-the-covers, the skip scan is doing a series of
Fantastic! Let me know if you're up for surfacing this through Phoenix.
Regards,
James
On Tue, Aug 13, 2013 at 7:48 AM, Anil Gupta anilgupt...@gmail.com wrote:
Excited to see this!
Best Regards,
Anil
On Aug 13, 2013, at 6:17 AM, zhzf jeff jeff.z...@gmail.com wrote:
very google local
Would be interesting to compare against Phoenix's Skip Scan
(http://phoenix-hbase.blogspot.com/2013/05/demystifying-skip-scan-in-phoenix.html)
which does a scan through a coprocessor and is more than 2x faster
than multi Get (plus handles multi-range scans in addition to point
gets).
James
On
), I will try
to benchmark this table alone against Phoenix on another cluster. Thanks.
Regards,
- kiru
Kiru Pakkirisamy | webcloudtech.wordpress.com
From: James Taylor jtay...@salesforce.com
To: user@hbase.apache.org user@hbase.apache.org
Cc: Kiru
--
*From:* James Taylor jtay...@salesforce.com
*To:* user@hbase.apache.org; Kiru Pakkirisamy kirupakkiris...@yahoo.com
*Sent:* Sunday, August 18, 2013 2:07 PM
*Subject:* Re: Client Get vs Coprocessor scan performance
Kiru,
If you're able to post the key values, row key
it).
Is there a way to do a sort of user-defined function on a column? That
would take care of my calculation on that double.
Thanks again.
Regards,
- kiru
Kiru Pakkirisamy | webcloudtech.wordpress.com
From: James Taylor jtay...@salesforce.com
Or if you'd like to be able to use SQL directly on it, take a look at
Phoenix (https://github.com/forcedotcom/phoenix).
James
On Aug 27, 2013, at 8:14 PM, Jean-Marc Spaggiari
jean-m...@spaggiari.org wrote:
Take a look at sqoop?
Le 2013-08-27 23:08, ch huang justlo...@gmail.com a écrit :
+1 to what Nicolas said.
That goes for Phoenix as well. It's open source too. We do plan to port to
0.96 when our user community (Salesforce.com, of course, being one of them)
demands it.
Thanks,
James
On Wed, Sep 4, 2013 at 10:11 AM, Nicolas Liochon nkey...@gmail.com wrote:
It's open
Hey Kiru,
The Phoenix team would be happy to work with you to benchmark your
performance if you can give us specifics about your schema design, queries,
and data sizes. We did something similar for Sudarshan for a Bloomberg use
case here[1].
Thanks,
James
[1].
When a table is created with Phoenix, its HBase table is configured
with the Phoenix coprocessors. We do not specify a jar path, so the
Phoenix jar that contains the coprocessor implementation classes must
be on the classpath of the region server.
In addition to coprocessors, Phoenix relies on
Use Phoenix (https://github.com/forcedotcom/phoenix) by doing the following:
CREATE VIEW myHTableName (key VARBINARY NOT NULL PRIMARY KEY);
SELECT COUNT(*) FROM myHTableName;
As fenghong...@xiaomi.com said, you still need to scan the table, but
Phoenix will do it in parallel and use a coprocessor
/lib? Our
customers said it has to. But I feel it is unnecessary and weird. Can you
confirm?
Thanks
Tian-Ying
-Original Message-
From: James Taylor [mailto:jtay...@salesforce.com]
Sent: Tuesday, September 10, 2013 4:40 PM
To: user@hbase.apache.org
Subject: Re: deploy saleforce
Hey Anil,
The solution you've described is the best we've found for Phoenix (inspired
by the work of Alex at Sematext).
You can do all of this in a few lines of SQL:
CREATE TABLE event_data(
who VARCHAR, type SMALLINT, id BIGINT, when DATE, payload VARBINARY
CONSTRAINT pk PRIMARY KEY
Take a look at Phoenix (https://github.com/forcedotcom/phoenix). It supports
both salting and fuzzy row filtering through its skip scan.
On Sun, Oct 20, 2013 at 10:42 PM, Premal Shah premal.j.s...@gmail.com wrote:
Have you looked at FuzzyRowFilter? Seems to me that it might satisfy your
Phoenix restricts salting to a single byte.
Salting perhaps is misnamed, as the salt byte is a stable hash based on the
row key.
Phoenix's skip scan supports sub-key ranges.
We've found salting in general to be faster (though there are cases where
it's not), as it ensures better parallelization.
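A sketch of single-byte salting as described (the hash function here is illustrative; Phoenix's actual hash may differ): the salt is a stable hash of the row key modulo the bucket count, prepended to the key, so the same key always lands in the same bucket and point lookups still work.

```java
public class Salter {
    // Stable single-byte salt: hash the row key, mod by bucket count.
    static byte saltByte(byte[] rowKey, int buckets) {
        int h = 0;
        for (byte b : rowKey) h = 31 * h + b; // simple deterministic hash
        return (byte) Math.abs(h % buckets);  // always in [0, buckets)
    }

    // Prepend the salt byte to form the physical row key.
    static byte[] salted(byte[] rowKey, int buckets) {
        byte[] out = new byte[rowKey.length + 1];
        out[0] = saltByte(rowKey, buckets);
        System.arraycopy(rowKey, 0, out, 1, rowKey.length);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(saltByte(new byte[] { 1, 2, 3 }, 4));
    }
}
```

Because the salt is derived from the key rather than random, it spreads writes across regions without losing the ability to compute a row's location.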
this is the base
access pattern.
HTH
-Mike
On Oct 21, 2013, at 11:37 AM, James Taylor jtay...@salesforce.com wrote:
Phoenix restricts salting to a single byte.
Salting perhaps is misnamed, as the salt byte is a stable hash based on
the
row key.
Phoenix's skip scan supports sub-key
of your regions will be 1/2 the max size… but the size you really
want and 8-16 regions will be up to twice as big.
On Oct 21, 2013, at 3:26 PM, James Taylor jtay...@salesforce.com wrote:
What do you think it should be called, because
prepending-row-key-with-single-hashed-byte doesn't have
to, so you end up
with all regions half filled except for the last region in each 'modded'
value.
I wouldn't say it's a bad thing if you plan for it.
On Oct 21, 2013, at 5:07 PM, James Taylor jtay...@salesforce.com wrote:
We don't truncate the hash, we mod it. Why would you expect that data
The Phoenix team is pleased to announce the immediate availability of
Phoenix 2.1 [1].
More than 20 individuals contributed to the release. Here are some of the
new features
now available:
* Secondary Indexing [2] to create and automatically maintain global
indexes over your
primary table.
-
yuzhih...@gmail.com wrote:
From https://github.com/forcedotcom/phoenix/wiki/Secondary-Indexing :
Is date_col a column from data table ?
CREATE INDEX my_index ON my_table (date_col DESC, v1) INCLUDE (v3)
SALT_BUCKETS=10, DATA_BLOCK_ENCODING='NONE';
On Thu, Oct 24, 2013 at 5:24 PM, James
Take a look at Phoenix (https://github.com/forcedotcom/phoenix) which
will allow you to issue SQL to directly create tables, insert data,
and run queries over HBase using the data model described below.
Thanks,
James
On Oct 28, 2013, at 8:47 AM, saiprabhur saiprab...@gmail.com wrote:
Hi Folks,
. as fast or
faster than a batched get).
Thanks,
James
On Mon, Oct 28, 2013 at 11:14 AM, Asaf Mesika asaf.mes...@gmail.com wrote:
I couldn't get the Row Value Constructor feature.
Do you perhaps have a real world use case to demonstrate this?
On Friday, October 25, 2013, James Taylor wrote
We ingest logs using Pig to write Phoenix-compliant HFiles, load those into
HBase and then use Phoenix (https://github.com/forcedotcom/phoenix) to
query directly over the HBase data through SQL.
Regards,
James
On Thu, Nov 14, 2013 at 9:35 AM, sam wu swu5...@gmail.com wrote:
we ingest data
One other tool option for you is to use Phoenix. You use SQL to create a
table and define the columns through standard DDL. Your columns make up the
allowed KeyValues for your table and the metadata is surfaced through the
standard JDBC metadata APIs (with column family mapping to table catalog).
FYI, you can define BLOCKSIZE in your hbase-site.xml, just like with HBase
to make it global.
Thanks,
James
On Mon, Nov 25, 2013 at 9:08 PM, Azuryy Yu azury...@gmail.com wrote:
There is no way to declare a global property in Phoenix; you have to
declare BLOCKSIZE in each 'create' SQL.
such
Amit,
So sorry we didn't answer your question before - I'll post an answer now
over on our mailing list.
Thanks,
James
On Wed, Nov 27, 2013 at 8:46 AM, Amit Sela am...@infolinks.com wrote:
I actually asked some of these questions in the phoenix-hbase-user
googlegroup but never got an
I agree with Doug Meil's advice. Start with your row key design. In
Phoenix, your PRIMARY KEY CONSTRAINT defines your row key. You should lead
with the columns that you'll filter against most frequently. Then, take a
look at adding secondary indexes to speedup queries against other columns.
The Phoenix team is pleased to announce that Phoenix[1] has been accepted
as an Apache incubator project[2]. Over the next several weeks, we'll move
everything over to Apache and work toward our first release.
Happy to be part of the extended family.
Regards,
James
[1]
Mathan,
We already answered your question on the Phoenix mailing list. If you
have a follow up question, please post it there. This is not an HBase
issue.
Thanks,
James
On Dec 14, 2013, at 2:10 PM, mathan kumar immathanku...@gmail.com wrote:
-- Forwarded message --
From: x
FYI, scanner caching defaults to 1000 in Phoenix, but as folks have pointed
out, that's not relevant in this case b/c only a single row is returned
from the server for a COUNT(*) query.
On Sat, Dec 21, 2013 at 2:51 PM, Kristoffer Sjögren sto...@gmail.com wrote:
Yeah, I'm doing a count(*) query
Henning,
Jesse Yates wrote the back-end of our global secondary indexing system in
Phoenix. He designed it as a separate, pluggable module with no Phoenix
dependencies. Here's an overview of the feature:
https://github.com/forcedotcom/phoenix/wiki/Secondary-Indexing. The section
that discusses the
Otis,
I didn't realize Nutch uses HBase underneath. Might be interesting if you
serialized data in a Phoenix-compliant manner, as you could run SQL queries
directly on top of it.
Thanks,
James
On Thu, Jan 2, 2014 at 10:17 PM, Otis Gospodnetic
otis.gospodne...@gmail.com wrote:
Hi,
Have a
Hi LiLi,
Have a look at Phoenix (http://phoenix.incubator.apache.org/). It's a SQL
skin on top of HBase. You can model your schema and issue your queries just
like you would with MySQL. Something like this:
// Create table that optimizes for your most common query
// (i.e. the PRIMARY KEY
in your cluster. You can read more about salting here:
http://phoenix.incubator.apache.org/salted.html
On Thu, Jan 2, 2014 at 11:36 PM, Li Li fancye...@gmail.com wrote:
thank you. it's great.
On Fri, Jan 3, 2014 at 3:15 PM, James Taylor jtay...@salesforce.com
wrote:
Hi LiLi,
Have a look
do parallel scans for each bucket and do a merge sort on the
client, so the cost is pretty low for this (we also provide a way of
turning this off if your use case doesn't need it).
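The client-side merge sort over per-bucket scans can be sketched with a priority queue of cursors (illustrative, not Phoenix's implementation): each bucket's scan returns rows already in key order, and the queue repeatedly yields the smallest current row across buckets.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class MergeSortedScans {
    // Merge k sorted lists (one per salt bucket) into one sorted list.
    // Each queue entry is a cursor: { bucketIndex, offsetWithinBucket }.
    static List<String> merge(List<List<String>> buckets) {
        PriorityQueue<int[]> pq = new PriorityQueue<>(
            Comparator.comparing((int[] c) -> buckets.get(c[0]).get(c[1])));
        for (int i = 0; i < buckets.size(); i++) {
            if (!buckets.get(i).isEmpty()) pq.add(new int[] { i, 0 });
        }
        List<String> out = new ArrayList<>();
        while (!pq.isEmpty()) {
            int[] c = pq.poll();
            out.add(buckets.get(c[0]).get(c[1]));
            if (c[1] + 1 < buckets.get(c[0]).size()) {
                pq.add(new int[] { c[0], c[1] + 1 }); // advance this cursor
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(merge(List.of(List.of("a", "c"), List.of("b", "d"))));
    }
}
```

The cost is O(n log k) for n total rows and k buckets, which is why keeping sorted order across salted buckets stays cheap.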
Two years, JM? Now you're really going to have to start using Phoenix :-)
On Friday, January 3, 2014, James Taylor
love to see if your implementation can fit into the framework we
wrote
- we would be happy to work to see if it needs some more hooks or
modifications - I have a feeling this is pretty much what you guys
will
need
-Jesse
On Mon, Dec 23, 2013 at 10:01 AM, James Taylor
jtay
great but it's now only an experimental project. I
want to use only hbase. could you tell me the difference of Phoenix
and hbase? If I use hbase only, how should I design the schema and
some extra things for my goal? thank you
On Sat, Jan 4, 2014 at 3:41 AM, James Taylor jtay...@salesforce.com
?
On Sat, Jan 4, 2014 at 3:43 PM, James Taylor jtay...@salesforce.com
wrote:
Hi LiLi,
Phoenix isn't an experimental project. We're on our 2.2 release, and many
companies (including the company for which I'm employed, Salesforce.com)
use it in production today.
Thanks,
James
Hi Henning,
My favorite implementation of efficient composite row keys is Phoenix. We
support composite row keys whose byte representation sorts according to the
natural sort order of the values (inspired by Lily). You can use our type
system independent of querying/inserting data with Phoenix,
Hi William,
Phoenix uses this bucket mod solution as well (
http://phoenix.incubator.apache.org/salted.html). For the scan, you have to
run it in every possible bucket. You can still do a range scan, you just
have to prepend the bucket number to the start/stop key of each scan you
do, and then you
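Generating the per-bucket scan ranges can be sketched like this (illustrative helper names): one scan per possible bucket, with the bucket byte prepended to both the start and stop key.

```java
import java.util.ArrayList;
import java.util.List;

public class SaltedRangeScan {
    // For a range scan over a salted table, emit one { start, stop }
    // key pair per bucket, each prefixed with that bucket's salt byte.
    static List<byte[][]> scanRanges(byte[] start, byte[] stop, int buckets) {
        List<byte[][]> ranges = new ArrayList<>();
        for (int b = 0; b < buckets; b++) {
            ranges.add(new byte[][] { prepend((byte) b, start), prepend((byte) b, stop) });
        }
        return ranges;
    }

    static byte[] prepend(byte salt, byte[] key) {
        byte[] out = new byte[key.length + 1];
        out[0] = salt;
        System.arraycopy(key, 0, out, 1, key.length);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(scanRanges(new byte[] { 10 }, new byte[] { 20 }, 4).size());
    }
}
```

The per-bucket results each come back in sorted order and can then be merged client-side.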
Hi Jignesh,
Phoenix has support for multi-tenant tables:
http://phoenix.incubator.apache.org/multi-tenancy.html. Also, your primary
key constraint would transfer over as-is, since Phoenix supports composite
row keys. Essentially your pk constraint values get concatenated together
to form your row