Hi, John.

I checked out your code and it looks good :) I noticed that you use JavaFX, which is not included in OpenJDK, so the code fails to compile there. Since we don't tie ourselves to the Oracle JVM, I would suggest replacing it.
Good job, keep it going :)

Regards,

Alfonso Nishikawa

On Sat, Jul 20, 2019 at 22:25, John Mora (<jhnmora...@gmail.com>) wrote:

> Hi.
>
> I updated my report in the Wiki [1]. Also, I pushed my latest commits to my
> branch [2]. Please take a look if you have time.
>
> This week, I will look at the MapReduce tests for DataStores.
>
> Please let me know if you have suggestions.
>
> [1] https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>
> Thanks,
> John
>
> On Sat, Jul 13, 2019 at 19:31, John Mora (<jhnmora...@gmail.com>) wrote:
>
>> Hi all
>>
>> I updated my report in the Wiki [1]. Also, I pushed my latest commits to my
>> branch [2]. Please take a look if you have time.
>>
>> This week, I will be working on the getPartitions and deleteByQuery
>> methods and running the remaining tests in the DataStoreTestBase class.
>>
>> Please let me know if you have suggestions.
>>
>> [1] https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>
>> Best,
>> John.
>>
>> On Wed, Jul 10, 2019 at 16:17, John Mora (<jhnmora...@gmail.com>) wrote:
>>
>>> Hi Alfonso,
>>>
>>> Thanks so much for your time and support for this project. I will work
>>> on your comments. Responses inline :)
>>>
>>> On Tue, Jul 9, 2019 at 16:38, Alfonso Nishikawa
>>> (<alfonso.nishik...@gmail.com>) wrote:
>>>
>>>> Hi, John.
>>>>
>>>> Sorry for the delay, I am changing jobs and I have been very busy :( I
>>>> will try to answer your questions :)
>>>>
>>>> *> In the Employee example there is a field called 'dateOfBirth'. I
>>>> tried to map that field to the UNIXTIME_MICROS datatype of Kudu (I
>>>> intuitively assumed this is a date). However, in the Java world the
>>>> Employee field is a Long value and the Kudu datatype is a Timestamp.
>>>> So, I was wondering whether I should force the usage of the UNIXTIME_MICROS
>>>> datatype for this field or just use a LONG datatype in Kudu.*
>>>>
>>>> Avro 1.8 introduced "Logical Types", so there is a "date" type with an
>>>> underlying "int" [1]. This is the first time I have read about them, because
>>>> they weren't there until the last Avro version upgrade. I would suggest
>>>> ignoring "dates" and mapping dateOfBirth as a long, since in any case (in Avro)
>>>> the value is counted from the Unix epoch. After this first approach, a design
>>>> improvement would be great, though :)
>>>>
>>>> - Would it be good to have a "timestamp" type in the mapping, so that KuduStore
>>>> converts between the Entity long field <-> Kudu timestamp storage?
>>>> - Is there any other approach?
>>>>
>>>
>>> I think the Entity long field <-> Kudu timestamp conversion is the
>>> best alternative right now, because it would add more compatible datatypes
>>> to the mapping parameters that users can use. This conversion should
>>> not be difficult to implement, in my opinion. Also, support for the new Avro
>>> Date datatype could be implemented in later versions, because it would need
>>> further analysis in other datastores too. I will work on that.
>>>
>>>> *> What is Gora's policy regarding flush()?*
>>>> *> KuduClient has multiple flushing modes
>>>> <https://kudu.apache.org/apidocs/org/apache/kudu/client/SessionConfiguration.FlushMode.html> and
>>>> can also set a time interval
>>>> <https://kudu.apache.org/releases/1.2.0/apidocs/org/apache/kudu/client/KuduSession.html#setFlushInterval-int->
>>>> for automatic flush.*
>>>> *> Should these behaviors be configurable using the gora.properties file,
>>>> or should I just use the default configuration?*
>>>>
>>>> What we do in HBase is configure an autoflush option in gora.properties
>>>> [2], which is used when instantiating the Table, but at the same time we
>>>> implement the flush() method to force the flush [3].
>>>> I would suggest following that example, but adding the flushing options
>>>> of Kudu. What flushing mode (and time interval, if it applies) do you suggest?
>>>>
>>>
>>> Well, IMHO the default flush mode (auto flush sync) will do the job for
>>> most use cases. But I will add a configuration in gora.properties for
>>> selecting the other modes and specifying an autoflush time if needed by
>>> the user.
>>>
>>>> *> Also, while reviewing the datastore interface I noticed this method
>>>> 'getPartitions(Query<K, T> query)'. What is the expected behavior of this
>>>> method? Should I use the partition definition in the XML mapping file for
>>>> this?*
>>>>
>>>> The method getPartitions(Query) is related to Hadoop. Apache Gora
>>>> integrates with Hadoop by implementing a custom Map and Reduce that allow
>>>> reading/writing Entities directly.
>>>> You can take a look at HBase's implementation [4], which relies on
>>>> o.a.h.hbase.mapreduce.TableInputFormatBase [5] to compute the splits
>>>> (start key to end key) together with the location of each split, and
>>>> creates a collection of partitions [6].
>>>>
>>>> So, if Kudu allows computation to be performed on local Kudu splits,
>>>> then this method does the preparation needed to "send the
>>>> computation to where the data is locally".
>>>>
>>>> In any case, you can see that:
>>>>
>>>> - MongoDB store implementation does not implement splitting [7]
>>>> - Cassandra store implementation does not implement splitting [8]
>>>> - Aerospike store implementation does not implement splitting [9]
>>>> - Accumulo store implementation *does* implement splitting [10]
>>>>
>>>> If Kudu has a method to get the different splits for a table and their
>>>> locations, then you will be able to implement the full feature.
>>>>
>>>> This is Hadoop related and it is not trivial.
>>>> I haven't elaborated much, so if you find you need more information,
>>>> let me know :)
>>>>
>>>
>>> I will check whether Kudu has these features in order to implement this
>>> method. If not, I will use the default implementation found in other
>>> backends.
>>>
>>>> About Queries, what I can tell you is that HBase only implements "Start
>>>> key" + "End key" because it has only 2 operations: "get" and "scan", and
>>>> the querying is for the "scan" operation, where you want an interval (or all)
>>>> of the rows. Does Kudu have more querying functionality?
>>>>
>>>
>>> Yes, Kudu implements a Scanner for querying data, along with conditional
>>> predicates for filtering. I am using those classes.
>>>
>>>> On another topic, I am trying to install Kudu in standalone mode (all in 1
>>>> node). Do you use a Cloudera installation or do you have a standalone
>>>> installation? How do you do it? I found some instructions, but they talk
>>>> about compiling Kudu [11]. I was looking for something like HBase, where it
>>>> is just unzip + execute "hbase start".
>>>>
>>>
>>> I am using an embedded mini-cluster which comes with compiled binaries
>>> and can be used with Maven [1] for testing my code. Once it is mature
>>> enough, I think I will test the datastore with a Docker container [2].
>>> I could not find an unzip+execute bundle either, and I am kind of a noob at
>>> compiling it myself.
>>>
>>> [1] https://kudu.apache.org/docs/developing.html#_jvm_based_integration_testing
>>> [2] https://hub.docker.com/r/usuresearch/apache-kudu/
>>>
>>>> Good job and thank you!!
>>>> :)
>>>>
>>>> Regards,
>>>>
>>>> Alfonso Nishikawa
>>>>
>>>> [1] - https://avro.apache.org/docs/1.8.0/spec.html#Logical+Types
>>>> [2] - https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L175
>>>> [3] - https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L458
>>>> [4] - https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L472
>>>> [5] - https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L479
>>>> [6] - https://github.com/apache/gora/blob/apache-gora-0.9/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L517
>>>> [7] - https://github.com/apache/gora/blob/apache-gora-0.9/gora-mongodb/src/main/java/org/apache/gora/mongodb/store/MongoStore.java#L533
>>>> [8] - https://github.com/apache/gora/blob/apache-gora-0.9/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L292
>>>> [9] - https://github.com/apache/gora/blob/apache-gora-0.9/gora-aerospike/src/main/java/org/apache/gora/aerospike/store/AerospikeStore.java#L369
>>>> [10] - https://github.com/apache/gora/blob/apache-gora-0.9/gora-accumulo/src/main/java/org/apache/gora/accumulo/store/AccumuloStore.java#L902
>>>> [11] - https://kudu.apache.org/docs/installation.html
>>>>
>>>> On Mon, Jul 8, 2019 at 3:42, John Mora (<jhnmora...@gmail.com>) wrote:
>>>>
>>>>> Hi all.
>>>>>
>>>>> As every week, I updated my report in the Wiki [1]. Also, I pushed my
>>>>> latest commits to my branch [2]. Please take a look if you have time.
>>>>>
>>>>> This week, I will continue working on the Queries implementation;
>>>>> please reach out to me if you have any suggestions.
>>>>>
>>>>> Also, while reviewing the datastore interface I noticed this method
>>>>> 'getPartitions(Query<K, T> query)'. What is the expected behavior of this
>>>>> method? Should I use the partition definition in the XML mapping file for
>>>>> this?
>>>>>
>>>>> Cheers,
>>>>> John.
>>>>>
>>>>> [1] https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>
>>>>> On Sun, Jun 30, 2019 at 16:56, John Mora (<jhnmora...@gmail.com>) wrote:
>>>>>
>>>>>> Hi all.
>>>>>>
>>>>>> I received my first evaluation from the Google Summer of Code program
>>>>>> with a positive result. Thanks so much for your support and confidence in
>>>>>> the project and me.
>>>>>>
>>>>>> I updated my report of this week in the Wiki [1]. Also, I pushed my
>>>>>> latest commits to my branch [2].
>>>>>>
>>>>>> This week, I will be reviewing the serialization/deserialization
>>>>>> process in order to identify optimizations specific to Kudu, because I
>>>>>> used generic methods from other backends which could probably be better
>>>>>> tuned for Kudu. Also, I will start working on the Queries implementation.
>>>>>>
>>>>>> BTW, I added a question to the wiki about Date types. Please take
>>>>>> a look if you have time.
>>>>>>
>>>>>> [1] https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>
>>>>>> Cheers,
>>>>>> John
>>>>>>
>>>>>> On Thu, Jun 27, 2019 at 21:02, John Mora (<jhnmora...@gmail.com>) wrote:
>>>>>>
>>>>>>> Hi Carlos.
>>>>>>>
>>>>>>> Thanks for the reminder. I submitted the form yesterday. :D
>>>>>>>
>>>>>>> Best,
>>>>>>> John.
>>>>>>>
>>>>>>> On Thu, Jun 27,
>>>>>>> 2019 at 17:34, carlos muñoz (<carlosr...@gmail.com>) wrote:
>>>>>>>
>>>>>>>> Hi John
>>>>>>>>
>>>>>>>> The first Google Summer of Code evaluation is due on June 28th.
>>>>>>>> Please make sure you submit your Mentors' evaluation on time.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Carlos
>>>>>>>>
>>>>>>>> On Sun, Jun 23, 2019 at 18:29, John Mora (<jhnmora...@gmail.com>) wrote:
>>>>>>>>
>>>>>>>>> Hi all.
>>>>>>>>>
>>>>>>>>> FYI, I updated my report of this week on the Wiki [1]. Also, I
>>>>>>>>> pushed my latest commits to my branch [2].
>>>>>>>>>
>>>>>>>>> As I mentioned in the reports, I would like to know how datastores
>>>>>>>>> deal with flush(): should it always be executed manually?
>>>>>>>>>
>>>>>>>>> Finally, this week I will be implementing object
>>>>>>>>> serialization/deserialization in the methods put, get, delete and
>>>>>>>>> exists. Do you have any suggestions on how to proceed with this task?
>>>>>>>>>
>>>>>>>>> Footnote: Thanks for the feedback, Carlos, I fixed the problem.
>>>>>>>>>
>>>>>>>>> [1] https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> John
>>>>>>>>>
>>>>>>>>> On Mon, Jun 17, 2019 at 22:58, carlos muñoz (<carlosr...@gmail.com>) wrote:
>>>>>>>>>
>>>>>>>>>> Hi John
>>>>>>>>>>
>>>>>>>>>> Your last changes look good to me. Keep it up. However, I noticed
>>>>>>>>>> that you have created an Enumeration for datatypes, which is very
>>>>>>>>>> similar to the kudu-client's [2]. You should probably replace [1]
>>>>>>>>>> with [2] in order to avoid code duplication.
>>>>>>>>>>
>>>>>>>>>> [1] https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/Column.java#L76
>>>>>>>>>> [2] https://kudu.apache.org/apidocs/org/apache/kudu/Type.html
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Carlos
>>>>>>>>>>
>>>>>>>>>> On Sat, Jun 15, 2019 at 12:01, John Mora (<jhnmora...@gmail.com>) wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all.
>>>>>>>>>>>
>>>>>>>>>>> I updated my report of this week on the Wiki [1]. I noticed that
>>>>>>>>>>> my code is lacking some Javadoc documentation; I think I will be
>>>>>>>>>>> working on that this week. I would also like to enable and check
>>>>>>>>>>> the schema management tests (createSchema, existsSchema, etc.).
>>>>>>>>>>>
>>>>>>>>>>> [1] https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> John.
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jun 11, 2019 at 0:11, John Mora (<jhnmora...@gmail.com>) wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Alfonso.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks so much for your feedback. I am working on your comments.
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> John
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Jun 10, 2019 at 16:11, Alfonso Nishikawa
>>>>>>>>>>>> (<alfonso.nishik...@gmail.com>) wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi, John.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regarding your questions at the report [1]:
>>>>>>>>>>>>>
>>>>>>>>>>>>> - How to represent partitioning configurations on the
>>>>>>>>>>>>> mapping file.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This was discussed in other emails, wasn't it? :)
>>>>>>>>>>>>>
>>>>>>>>>>>>> - KuduTestHarness requires the Maven plugin
>>>>>>>>>>>>> os-maven-plugin, which needs Maven 3.1.1+, is it a problem for
>>>>>>>>>>>>> Apache Gora?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I believe it is not a problem.
>>>>>>>>>>>>> My Ubuntu comes with 3.6.0, well beyond 3.1.1, and I assume
>>>>>>>>>>>>> everyone uses a fairly recent version of Maven 3 :)
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1] - https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Jun 10, 2019 at 21:07, Alfonso Nishikawa
>>>>>>>>>>>>> (<alfonso.nishik...@gmail.com>) wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi, John.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you!
>>>>>>>>>>>>>> Things I have seen:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - The version of a Maven dependency [1] should go in the
>>>>>>>>>>>>>> Dependency Management section of the root pom [2]. The same goes
>>>>>>>>>>>>>> for [3] and onwards: the version should not be set there.
>>>>>>>>>>>>>> - Set the scope of test dependencies to test, at [4] and onwards.
>>>>>>>>>>>>>> - Set the indentation to 2 spaces for the pom [5].
>>>>>>>>>>>>>> - Missing "t" in "localhost" at [6].
>>>>>>>>>>>>>> - Port 13 for Kudu? That is the "Daytime Protocol" (RFC 867), and
>>>>>>>>>>>>>> you will need root permission to run it. The default port for
>>>>>>>>>>>>>> Kudu is 7051, isn't it?
>>>>>>>>>>>>>> - I would ask you to add to your KuduStore [8] the same
>>>>>>>>>>>>>> functionality to load the mapping from configuration as in
>>>>>>>>>>>>>> HBase's store [7]. This will have implications for your
>>>>>>>>>>>>>> readMapping at [9], so take a look at the one for HBase at [10].
>>>>>>>>>>>>>> - I know it is done in other backends, but avoid RuntimeExceptions
>>>>>>>>>>>>>> (at least in Java, since we have the checked ones) like in [11].
>>>>>>>>>>>>>> You can wrap them in GoraException. An example is [12].
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> And nothing more :)
>>>>>>>>>>>>>> Keep going, good job.
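The last review item above (wrap unchecked exceptions in GoraException instead of letting a RuntimeException escape) can be sketched as follows. This is only a minimal illustration under stated assumptions: the nested GoraException class is a local stand-in declared so the snippet compiles on its own (the real class lives in org.apache.gora.util and, as far as I know, also extends IOException), and parsePort is a hypothetical helper, not code from the GORA-485 branch.

```java
import java.io.IOException;

final class MappingErrorSketch {
  private MappingErrorSketch() {}

  /** Local stand-in for org.apache.gora.util.GoraException (assumption: checked, extends IOException). */
  static class GoraException extends IOException {
    GoraException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  /** Hypothetical helper: reads a port number from a mapping or properties value. */
  static int parsePort(String raw) throws GoraException {
    try {
      return Integer.parseInt(raw.trim());
    } catch (NumberFormatException e) {
      // Wrap the unchecked exception so callers see a checked GoraException
      // and the original cause is preserved for debugging.
      throw new GoraException("Invalid port in mapping: " + raw, e);
    }
  }

  public static void main(String[] args) {
    try {
      System.out.println(parsePort("7051"));
      parsePort("not-a-port");
    } catch (GoraException e) {
      System.out.println(e.getMessage() + " (cause: " + e.getCause().getClass().getSimpleName() + ")");
    }
  }
}
```

Keeping the original exception as the cause means the stack trace of the underlying failure is not lost when the checked exception reaches the caller.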
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1] - https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L98
>>>>>>>>>>>>>> [2] - https://github.com/jhnmora000/gora/blob/GORA-485/pom.xml#L890
>>>>>>>>>>>>>> [3] - https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L121
>>>>>>>>>>>>>> [4] - https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml#L180
>>>>>>>>>>>>>> [5] - https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/pom.xml
>>>>>>>>>>>>>> [6] - https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/test/resources/gora.properties#L18
>>>>>>>>>>>>>> [7] - https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L92
>>>>>>>>>>>>>> [8] - https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/store/KuduStore.java#L53
>>>>>>>>>>>>>> [9] - https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/KuduMappingBuilder.java#L81
>>>>>>>>>>>>>> [10] - https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L822
>>>>>>>>>>>>>> [11] - https://github.com/jhnmora000/gora/blob/GORA-485/gora-kudu/src/main/java/org/apache/gora/kudu/mapping/KuduMappingBuilder.java#L141
>>>>>>>>>>>>>> [12] - https://github.com/jhnmora000/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L268
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Alfonso Nishikawa
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sat, Jun 8, 2019 at 20:26, John Mora (<jhnmora...@gmail.com>) wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I have just updated my weekly reports on Cwiki [1].
>>>>>>>>>>>>>>> This next week I think I should focus on the create schema
>>>>>>>>>>>>>>> operation and on solving the issue of the partitioning
>>>>>>>>>>>>>>> configurations in the mapping file.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please let me know if you have suggestions. My latest commits
>>>>>>>>>>>>>>> are available here [2].
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [1] https://cwiki.apache.org/confluence/display/GORA/GORA-485+Apache+Kudu+datastore+for+Gora+Reports
>>>>>>>>>>>>>>> [2] https://github.com/jhnmora000/gora/tree/GORA-485
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>> John
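For the dateOfBirth question discussed earlier in this thread (Entity long field <-> Kudu timestamp), here is a rough sketch of what the conversion could look like. It assumes the Entity stores epoch milliseconds, which the thread does not confirm, and that the Kudu column is UNIXTIME_MICROS (microseconds since the Unix epoch). The class and method names are hypothetical and are not part of Gora or the Kudu client.

```java
import java.util.concurrent.TimeUnit;

final class TimestampMappingSketch {
  private TimestampMappingSketch() {}

  /** Entity value (assumed epoch milliseconds) to microseconds for a UNIXTIME_MICROS column. */
  static long toKuduMicros(long epochMillis) {
    return TimeUnit.MILLISECONDS.toMicros(epochMillis);
  }

  /** Kudu microseconds back to epoch milliseconds; floorDiv keeps pre-1970 values rounding downwards. */
  static long toEntityMillis(long epochMicros) {
    return Math.floorDiv(epochMicros, 1000L);
  }

  public static void main(String[] args) {
    long dateOfBirth = 631152000000L; // 1990-01-01T00:00:00Z as epoch milliseconds
    long stored = toKuduMicros(dateOfBirth);
    System.out.println("stored micros: " + stored);
    System.out.println("round trip ok: " + (toEntityMillis(stored) == dateOfBirth));
  }
}
```

Going from microseconds back to milliseconds drops sub-millisecond precision, so the round trip is only lossless for values that were written from the Entity side.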