Some more details: we ran some simple tests to compare the read/write
performance of Spark+Hive and Spark+Phoenix, and got the following results.
Copying a table without any transformations (about 800 million records):
Hive (TEZ) - 752 sec
Spark:
From Hive to Hive: 2463 sec
From Phoenix to H
Hello,
I use the Phoenix Spark plugin to load data from HBase.
The SparkSqlContextFunctions.phoenixTableAsDataFrame() method returns a
Dataset for a given table name, columns, and predicate.
Is it possible to also provide a LIMIT clause so that the number of retrieved
r
You can do it directly with Spark SQL.
Xavier
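
A minimal sketch of what this suggestion could look like, assuming the
phoenix-spark data source is on the classpath. The table name TABLE1, the
columns ID and COL1, and the ZooKeeper address zk-host:2181 are hypothetical
placeholders, not values from this thread:

```scala
import org.apache.spark.sql.SparkSession

object PhoenixLimitSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("phoenix-limit-sketch")
      .getOrCreate()

    // Load the Phoenix table as a DataFrame.
    // "TABLE1" and "zk-host:2181" are hypothetical; use your own values.
    val df = spark.read
      .format("org.apache.phoenix.spark")
      .option("table", "TABLE1")
      .option("zkUrl", "zk-host:2181")
      .load()

    // Register the DataFrame so it can be queried with plain Spark SQL.
    df.createOrReplaceTempView("table1")

    // WHERE and LIMIT are expressed in Spark SQL instead of the
    // predicate argument of phoenixTableAsDataFrame().
    val limited = spark.sql(
      "SELECT ID, COL1 FROM table1 WHERE COL1 = 'test_row_1' LIMIT 100")

    // Print the physical plan to see which filters reach the data source.
    limited.explain()
    limited.show()

    spark.stop()
  }
}
```

This sketch does not establish where the LIMIT is applied; the output of
explain() is one way to check what is handed to the Phoenix data source.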
On 2018-03-07 06:38 AM, alexander.scherba...@yandex.com wrote:
Hello,
I use the Phoenix Spark plugin to load data from HBase.
The SparkSqlContextFunctions.phoenixTableAsDataFrame() method returns a
Dataset
for the given table
Does this mean that only the limited number of rows will be sent from each
HBase Region Server to the client?
I ask because I could use a WHERE clause in Spark SQL in the same way,
instead of passing the predicate.
Thanks,
Alexandr.
07.03.2018, 15:35, "Xavier Jodoin" :
> You can d
It will limit the number of rows fetched by the client.
On 2018-03-07 07:54 AM, alexander.scherba...@yandex.com wrote:
Does this mean that only the limited number of rows will be sent from each
HBase Region Server to the client?
I ask because I could use a WHERE clause in the same wa
Is there documentation describing which queries are pushed down to the
server, and how, when fetching data through the Phoenix Spark plugin?
Thanks,
Alexandr.
07.03.2018, 16:24, "Xavier Jodoin" :
> it will limit the number of rows fetched by the client
>
> On 2018-03-07 07:54 AM, alexander.sche
We found https://issues.apache.org/jira/browse/PHOENIX-3547, which seems to
be precisely our problem. We would want at least the option to use a bigint
rather than the int in the JIRA to accommodate massive growth. While we
intend to have many tenants, we don't intend to use the Phoenix "tenant_id"