Hi Dawid, that's great! Of course, whenever you can. I have actually found that the saveToPhoenix() function provided is not so bad to use.
Thanks!

On 16 June 2015 at 15:08, Dawid Wysakowicz <[email protected]> wrote:

> Hi Yiannis,
> I've resolved the issue when I ran the code on a bigger set of data. I
> will try to post the code when I polish it a bit. The partitions should be
> sorted with a KeyValue sorter before bulkSaving them.
>
> 2015-06-16 15:10 GMT+02:00 Yiannis Gkoufas <[email protected]>:
>
>> Hi,
>>
>> I didn't realize that I only sent this to Dawid.
>> Resending to the entire list in case someone else has encountered this
>> error before:
>>
>> 15/06/10 23:45:16 WARN TaskSetManager: Lost task 34.48 in stage 0.0 (TID
>> 816, iriclusnd20): java.io.IOException: Added a key not lexically larger
>> than previous
>> key=\x00\x17\x083661310846GMP\x00\x00\x00\x01E\xF3jH@\x010GEN\x00\x00\x01M\xDF\xA6!\xFF\x04,
>> lastkey=\x00\x17\x1E7359530994GMP\x00\x00\x00\x01@\xD4\xFE\xC0\xC0\x010_0\x00\x00\x01M\xDF\xA6!\xFF\x04
>>     at org.apache.hadoop.hbase.io.hfile.AbstractHFileWriter.checkKey(AbstractHFileWriter.java:202)
>>     at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.append(HFileWriterV2.java:288)
>>     at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.append(HFileWriterV2.java:253)
>>     at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:935)
>>     at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:196)
>>     at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:149)
>>     at org.apache.spark.rdd.PairRDDFunctions$$anonfun$12.apply(PairRDDFunctions.scala:1000)
>>     at org.apache.spark.rdd.PairRDDFunctions$$anonfun$12.apply(PairRDDFunctions.scala:979)
>>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>>     at org.apache.spark.scheduler.Task.run(Task.scala:64)
>>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>     at java.lang.Thread.run(Thread.java:745)
>>
>> I get the above error multiple times.
>> The HDFS path is fine; there is no error about that.
>>
>> Thanks!
>>
>> On 11 June 2015 at 17:49, Dawid <[email protected]> wrote:
>>
>>> Hi,
>>> Your code seems OK to me. The only difference from what I do is that I
>>> explicitly pass an HDFS path to bulkSave; I am not sure how "/bulk" is
>>> resolved.
>>> I am a beginner with Spark, HBase, Phoenix etc., but if you'd like to
>>> use this code I could try to investigate your problem. I need the full
>>> stack trace, though.
>>>
>>> On 11.06.2015 00:53, Yiannis Gkoufas wrote:
>>>
>>> Hi Dawid,
>>>
>>> Yes, I have been using your code. Probably I am invoking the classes in
>>> a wrong way:
>>>
>>> val data = readings.map(e => e.split(","))
>>>   .map(e => (e(0), e(1).toLong, e(2).toDouble, e(3).toDouble))
>>> val tableName = "TABLE"
>>> val columns = Seq("SMID", "DT", "US", "GEN")
>>> val zkUrl = Some("localhost:2181")
>>> val functions = new ExtendedProductRDDFunctions(data)
>>> val hfiles = functions.toHFile(tableName, columns, new Configuration, zkUrl)
>>> val loader = new BulkPhoenixLoader(hfiles)
>>> loader.bulkSave(tableName, "/bulk", None)
>>>
>>> Does the above seem correct to you?
>>>
>>> Thanks a lot!
>>>
>>> On 10 June 2015 at 19:13, Dawid <[email protected]> wrote:
>>>
>>>> Thx a lot, James. That's the case.
>>>>
>>>> On 10.06.2015 19:50, James Taylor wrote:
>>>>
>>>>> Dawid,
>>>>> It might be timestamp related. Check the timestamp of the rows/cells
>>>>> you imported from the HBase shell. Are the timestamps later than the
>>>>> server timestamp? In that case, you wouldn't see that data. If this is
>>>>> the case, you can try specifying the CURRENT_SCN property at
>>>>> connection time with a timestamp later than the timestamp of the
>>>>> rows/cells to verify.
>>>>> Thanks,
>>>>> James
>>>>>
>>>>> On Wed, Jun 10, 2015 at 10:14 AM, Dawid <[email protected]> wrote:
>>>>>
>>>>>> Yes, that's right: I have generated HFiles that I managed to load, so
>>>>>> they are visible in HBase. I can't make them visible to Phoenix.
>>>>>>
>>>>>> What I noticed today: I have rows loaded from the generated HFiles
>>>>>> and rows upserted through sqlline. When I run 'DELETE FROM TABLE',
>>>>>> only the upserted ones disappear. The rows loaded from HFiles still
>>>>>> persist in HBase.
>>>>>>
>>>>>> Yiannis, how do you generate the HFiles? You can see my code here:
>>>>>> https://gist.github.com/dawidwys/3aba8ba618140756da7c
>>>>>>
>>>>>> On 10.06.2015 17:57, Yiannis Gkoufas wrote:
>>>>>>
>>>>>> Hi Dawid,
>>>>>>
>>>>>> I am trying to do the same thing, but I hit a wall while writing the
>>>>>> HFiles, getting the following error:
>>>>>>
>>>>>> java.io.IOException: Added a key not lexically larger than previous
>>>>>> key=\x00\x168675230967GMP\x00\x00\x00\x01=\xF4h)\xE0\x010GEN\x00\x00\x01M\xDE.\xB4T\x04,
>>>>>> lastkey=\x00\x168675230967GMP\x00\x00\x00\x01=\xF5\x0C\xF5`\x010_0\x00\x00\x01M\xDE.\xB4T\x04
>>>>>>
>>>>>> You have reached the point where you are generating the HFiles and
>>>>>> loading them, but you don't see any rows in the table?
>>>>>> Is that correct?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> On 8 June 2015 at 18:09, Dawid <[email protected]> wrote:
>>>>>>
>>>>>>> Yes, I did. I also tried to execute some upserts using sqlline after
>>>>>>> importing the HFiles. The rows from the upserts are visible both in
>>>>>>> sqlline and in the hbase shell, but the rows imported from HFiles
>>>>>>> are only in the hbase shell.
>>>>>>>
>>>>>>> On 08.06.2015 19:06, James Taylor wrote:
>>>>>>>
>>>>>>>> Dawid,
>>>>>>>> Perhaps a dumb question, but did you execute a CREATE TABLE
>>>>>>>> statement in sqlline for the tables you're importing into? Phoenix
>>>>>>>> needs to be told the schema of the table (i.e. it's not enough to
>>>>>>>> just create the table in HBase).
>>>>>>>> Thanks,
>>>>>>>> James
>>>>>>>>
>>>>>>>> On Mon, Jun 8, 2015 at 10:02 AM, Dawid <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Any suggestions? Some clues what to check?
>>>>>>>>>
>>>>>>>>> On 05.06.2015 23:21, Dawid wrote:
>>>>>>>>>
>>>>>>>>> Yes, I can see it in hbase-shell.
>>>>>>>>>
>>>>>>>>> Sorry for the bad links; I hadn't used private repositories on
>>>>>>>>> GitHub before, so I moved the files to a gist:
>>>>>>>>> https://gist.github.com/dawidwys/3aba8ba618140756da7c
>>>>>>>>> I hope it works this time.
>>>>>>>>>
>>>>>>>>> On 05.06.2015 23:09, Ravi Kiran wrote:
>>>>>>>>>
>>>>>>>>> Hi Dawid,
>>>>>>>>> Do you see the data when you run a simple scan or count of the
>>>>>>>>> table in the HBase shell?
>>>>>>>>>
>>>>>>>>> FYI, the links lead me to a 404: File not found.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Ravi
>>>>>>>>>
>>>>>>>>> On Fri, Jun 5, 2015 at 1:17 PM, Dawid <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>> I was trying to write some utilities to bulk load data through
>>>>>>>>>> HFiles from Spark RDDs, following the pattern of CSVBulkLoadTool.
>>>>>>>>>> I managed to generate some HFiles and load them into HBase, but I
>>>>>>>>>> can't see the rows using sqlline. I would be more than grateful
>>>>>>>>>> for any suggestions.
>>>>>>>>>>
>>>>>>>>>> The classes can be accessed at:
>>>>>>>>>> https://github.com/dawidwys/gate/blob/master/src/main/scala/pl/edu/pw/elka/phoenix/BulkPhoenixLoader.scala
>>>>>>>>>> https://github.com/dawidwys/gate/blob/master/src/main/scala/pl/edu/pw/elka/phoenix/ExtendedProductRDDFunctions.scala
>>>>>>>>>>
>>>>>>>>>> Thanks in advance
>>>>>>>>>>
>>>>>>>>>> Dawid Wysakowicz
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Regards,
>>>>>>>>>> Dawid
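For anyone else hitting the "Added a key not lexically larger than previous" error in this thread: HFileOutputFormat2 rejects any cell whose key is not strictly larger than the one written before it, so, as Dawid says, each partition has to be sorted by key before bulkSaving. A minimal sketch of that idea in Spark (the helper name is made up for illustration, and the key type is assumed to carry an Ordering matching HBase's unsigned-byte comparison; the actual code in Dawid's gist may differ):

```scala
import scala.reflect.ClassTag
import org.apache.spark.Partitioner
import org.apache.spark.rdd.RDD

// HFileWriter checks that every appended cell's key is lexically larger
// than the previous one, so the (rowkey, cell) pairs must arrive at the
// writer sorted within each partition.
def sortedForBulkSave[K: Ordering: ClassTag, V: ClassTag](
    pairs: RDD[(K, V)],
    partitioner: Partitioner): RDD[(K, V)] =
  // One shuffle does both jobs: repartition by the table's region
  // boundaries and sort each partition by key in the same pass.
  pairs.repartitionAndSortWithinPartitions(partitioner)
```

In the invocation Yiannis posted, something like this would sit between toHFile and bulkSave, with a partitioner derived from the target table's region start keys (the role HFileOutputFormat2's configureIncrementalLoad plays on the MapReduce side).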

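To make James's CURRENT_SCN suggestion concrete, here is a rough sketch of opening a Phoenix JDBC connection with the SCN set later than the bulk-loaded cells' timestamps, so they become visible if the timestamp theory is right. The ZooKeeper quorum, the one-minute offset, and the table name are placeholders; and as James notes later in the thread, the table must also have been created in Phoenix via CREATE TABLE, not just in HBase, before its rows can be queried at all:

```scala
import java.sql.DriverManager
import java.util.Properties

// Phoenix reads the "CurrentSCN" connection property as the point in
// time (an HBase timestamp, ms since epoch) at which queries run.
// Setting it past the imported cells' timestamps makes those cells
// visible, confirming or ruling out the timestamp explanation.
val props = new Properties()
props.setProperty("CurrentSCN",
  (System.currentTimeMillis() + 60 * 1000L).toString)

val conn = DriverManager.getConnection("jdbc:phoenix:localhost:2181", props)
try {
  val rs = conn.createStatement().executeQuery("SELECT COUNT(*) FROM TABLE")
  if (rs.next()) println(s"rows visible at this SCN: ${rs.getLong(1)}")
} finally {
  conn.close()
}
```

If the count now includes the bulk-loaded rows, the fix is to generate the HFiles with cell timestamps at or before the current server time.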