Maybe this will help?
http://phoenix.apache.org/bulk_dataload.html#Permissions_issues_when_uploading_HFiles
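From what I remember of that section, the root cause is that the HFiles
written by the import job aren't readable by the hbase user that moves them
into place, and the workaround is to loosen the file-creation umask for the
job, e.g. by passing the standard Hadoop setting
-Dfs.permissions.umask-mode=000 ahead of the tool-specific arguments. That's
from memory though - the page itself is authoritative.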
bq. I struggle to understand how to use split points in the create
statement.
You can't always use split points - it depends on your schema and the
knowledge you have about the data being loaded. For example, if the leading
column of your primary key is an opaque value like a hash, there are no
natural split points to name up front.
Aaron,
Looks like a permission issue?
org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint threw
java.lang.IllegalStateException: Failed to get FileSystem instance
java.lang.IllegalStateException: Failed to get FileSystem instance
at
Gabriel,
Thanks for the response, I appreciate it.
I struggle to understand how to use split points in the create statement.
(1) Creating a table with Split Points:

CREATE TABLE stats.prod_metrics (
    host char(50) not null,
    created_date date not null,
    txn_count bigint
    CONSTRAINT pk PRIMARY KEY (host, created_date))
SPLIT ON ('CS', 'EU', 'NA');  -- example split values; use prefixes that match your own leading host values
Hi John,
You can actually pre-split a table when creating it, either by
specifying split points in the CREATE TABLE statement[1] or by using
salt buckets[2]. In my current use cases I always use salting, but
take a look at the salting documentation[2] for the pros and cons of
this.
Your
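For comparison, a minimal salted-table sketch (the table and column names
here are made up, not from your schema):

CREATE TABLE example_metrics (
    host VARCHAR NOT NULL PRIMARY KEY,
    txn_count BIGINT
) SALT_BUCKETS = 16;  -- also pre-splits the table into 16 regions, one per bucket

Salting prepends a hashed byte to each row key, so you get an even write
distribution without having to know your key distribution in advance.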
Hi Aaron,
How many regions are there in the LINEITEM table? The fact that you
needed to bump the
hbase.mapreduce.bulkload.max.hfiles.perRegion.perFamily setting up to
48 suggests that the amount of data going into a single region of that
table is probably pretty large.
Along the same line, I
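If you're not sure how many regions the table has, one quick way to check
(assuming an HBase version where the catalog table is named hbase:meta) is
to scan the catalog for the table's region rows:

echo "scan 'hbase:meta', {FILTER => \"PrefixFilter('LINEITEM,')\", COLUMNS => ['info:regioninfo']}" | hbase shell

The HBase Master web UI also shows a per-table region count.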
Gabriel,
Thanks for the help; it's good to know that those params can be passed on
the command line and that the order is important.
I am trying to load the 100GB TPC-H data set and ultimately run the TPC-H
queries. All of the tables loaded relatively easily except LINEITEM (the
Hi Aaron,
I'll answer your questions directly first, but please see the bottom
part of this mail for important additional details.
You can specify the
"hbase.mapreduce.bulkload.max.hfiles.perRegion.perFamily" parameter
(referenced from your StackOverflow link) on the command line of your
job invocation; note that the -D options have to come before the
tool-specific arguments.
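Something along these lines (jar name and input path are placeholders for
whatever you're using):

hadoop jar phoenix-<version>-client.jar \
    org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    -Dhbase.mapreduce.bulkload.max.hfiles.perRegion.perFamily=48 \
    --table LINEITEM \
    --input /tmp/lineitem.csv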
Hi all, I'm running the CsvBulkLoadTool, trying to pull in some data. The
MapReduce job appears to complete and gives some promising information:
Phoenix MapReduce Import
Upserts