Hi, I would like to perform a bulk insert into HBase using Apache Phoenix from Spark. I tried the Apache Phoenix Spark library but, as far as I can tell from the code, it performs a JDBC batch of upserts (am I right?). Instead, I want to perform a bulk load like the one described in this blog post (https://zeyuanxy.github.io/HBase-Bulk-Loading/), while taking advantage of the automatic conversion from Java/Scala types to bytes.
I'm currently using Phoenix 4.5.2, so I cannot use Hive to manipulate the Phoenix table, and if possible I want to avoid spawning an MR job that reads data from CSV (https://phoenix.apache.org/bulk_dataload.html). Essentially, I just want to do what the CSV loader does with MR, but programmatically with Spark (since the data I want to persist is already loaded in memory). Thank you all!