After reading more of the Phoenix code, I think I should do the following:
1. Use `PhoenixRuntime.getTable` to get a `PTable`
2. Use `table.getPKColumns` to get a list of `PColumn`s
3. For each column, use `column.getDataType`; then `dataType.toBytes(value, column.getSortOrder)`
4. Finally, create a new
Thank you for your reply. I tried passing the PKs through an IN clause. But
the number of PKs to match between the files and the Phoenix table can
sometimes be 70 million, and I felt it would be much slower if I used an IN
clause. May I know how many PKs you passed through the IN clause?
On Tue, Jul 12, 2016 at
I actually did something similar recently. If you are joining on primary keys,
you can run batched queries with the IN clause.
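
A minimal sketch of what batching with the IN clause could look like, assuming
the keys are split into fixed-size chunks and each chunk becomes one
parameterized query. The table name `MY_TABLE`, the column name `ID`, and the
batch size are hypothetical placeholders, not values from this thread, and the
sketch only builds the SQL text and parameter lists; running them against
Phoenix is left to whatever client is in use.

```python
from itertools import islice


def chunked(keys, batch_size):
    """Yield successive lists of at most batch_size keys."""
    it = iter(keys)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def in_clause_queries(keys, batch_size=1000, table="MY_TABLE", pk_column="ID"):
    """Yield (sql, params) pairs, one per batch of primary keys.

    table and pk_column are hypothetical names used for illustration only.
    """
    for batch in chunked(keys, batch_size):
        # One "?" placeholder per key in this batch.
        placeholders = ", ".join("?" for _ in batch)
        sql = f"SELECT * FROM {table} WHERE {pk_column} IN ({placeholders})"
        yield sql, batch


if __name__ == "__main__":
    for sql, params in in_clause_queries(range(5), batch_size=2):
        print(sql, params)
```

With tens of millions of keys the number of batches grows accordingly, so the
batch size is something to tune by measurement rather than a fixed
recommendation.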
> On Jul 11, 2016, at 9:05 PM, Mohanraj Ragupathiraj
> wrote:
>
> Hi,
>
> I have a Scenario in which i have to load a phoenix table as a whole
Hi,
I have a scenario in which I have to load a Phoenix table as a *whole* and
join it with multiple files in Spark. But it takes around 30 minutes just
to read 600 million records from the Phoenix table. I feel it is
inappropriate to load the full table, as HBase works best for random
lookups.
Thanks Mujtaba. This is good to know. It is possible to manipulate the key bits
to avoid the hot-spotting, so we will probably try out an unsalted table.
Still, it would be nice if combining indexes in a single table were possible.
> On Jul 11, 2016, at 2:41 PM, Mujtaba Chohan
FYI, if your keys are not written in order, i.e. you are not concerned about
write hot-spotting/write throughput, then try writing your data to an
un-salted table. Read performance for an un-salted table can be comparable to
or better than a salted one with stats
These indexes will indeed be salted (as is the data table). If all indexes
reside in the same table, there will be only 512 regions in total (256 for the
data table, 256 for the combined index table). Indeed, the combined index table
will be 12x as large as a single index table. But it doesn’t cover
Will the index be salted (and that's why it's 256 regions per table)? If
not, how many regions would there be if all indexes are in the same table
(assuming the table is 12x bigger than one index table)?
On Monday, July 11, 2016, Simon Wang wrote:
> Thanks, Mujtaba. What
Thanks, Mujtaba. What you wrote is exactly what I meant. While not all our
tables need this many regions and indexes, the number of regions per region
server can grow quickly.
-Simon
> On Jul 11, 2016, at 2:17 PM, Mujtaba Chohan wrote:
>
> 12 index tables * 256 region per
12 index tables * 256 regions per table = ~3K regions for index tables,
assuming we are talking about covered indexes, which implies 200+
regions/region server on a 15 node cluster.
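
The arithmetic behind these figures can be checked with a short sketch. It
assumes regions are spread evenly across the 15 region servers, which is a
simplification, and it also computes the 512-region combined-table case
mentioned elsewhere in this thread for comparison. The function name is
hypothetical.

```python
def regions_per_server(total_regions, nodes):
    """Average regions per region server, assuming an even spread."""
    return total_regions / nodes


NODES = 15
REGIONS_PER_TABLE = 256
INDEX_TABLES = 12

# Separate index tables: 12 tables, each salted into 256 regions.
separate_index_regions = INDEX_TABLES * REGIONS_PER_TABLE

# Combined case from the thread: 256 regions for the data table plus
# 256 for a single combined index table.
combined_total_regions = REGIONS_PER_TABLE + REGIONS_PER_TABLE

if __name__ == "__main__":
    print(separate_index_regions,
          regions_per_server(separate_index_regions, NODES))
    print(combined_total_regions,
          regions_per_server(combined_total_regions, NODES))
```
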
On Mon, Jul 11, 2016 at 1:58 PM, James Taylor
wrote:
> Hi Simon,
>
> I might be missing
Hi Simon,
I might be missing something, but with 12 separate index tables or 1 index
table, the amount of data will be the same. Won't there be the same number
of regions either way?
Thanks,
James
On Sun, Jul 10, 2016 at 10:50 PM, Simon Wang wrote:
> Hi James,
>
>