From what I understand, all Spark requires is an InputFormat class, which Hypertable already provides. We've got InputFormat classes for both MapReduce and Streaming MapReduce:
mapreduce/InputFormat.java <https://github.com/hypertable/hypertable/blob/master/src/java/MapReduce/org/hypertable/hadoop/mapreduce/InputFormat.java>
mapred/TextTableInputFormat.java <https://github.com/hypertable/hypertable/blob/master/src/java/MapReduce/org/hypertable/hadoop/mapred/TextTableInputFormat.java> (streaming)

As far as coprocessors go, we're probably not going to implement them
anytime soon. We plan to build GROUP BY directly into the Hypertable
query language.

- Doug

On Wed, Aug 13, 2014 at 8:39 AM, Rajesh Bhardwaj <[email protected]> wrote:

> Hello Doug,
> Any chance of getting Spark Connector along with release 1 and coprocessor
> (for doing group by)
>
> regards
>
>
> On Wednesday, August 13, 2014 8:43:00 PM UTC+5:30, Doug Judd wrote:
>
>> This is something that will get addressed soon. We plan to support the
>> introduction of user defined (custom) data types. Originally this was
>> scheduled for 2.0, but it's looking like we may include this in the 1.0
>> release.
>>
>> - Doug
>>
>>
>> On Tue, Aug 12, 2014 at 1:22 PM, Dorian Hoxha <[email protected]>
>> wrote:
>>
>>> Hi Doug,
>>> Congrats on the new release. 1 question. I looked (the code before this
>>> release) about the possibility of implementing some new cell types and saw
>>> that the code for:
>>>
>>> 1. creating schema
>>> 2. schema validation
>>> 3. allowed values
>>> 4. how stuff are saved in cellcache/accessgroup
>>> 5. how stuff are merged on read/mergescanner or compaction
>>> 6. basically everything
>>>
>>> was distributed over many files and a little hard to follow on several
>>> files (i compared the COUNTER cells, there were many if_cell_is_counter()
>>> function calls in many files).
>>>
>>> Is it possible to implement a class CELL() with different methods and
>>> attributes, so we don't have to change/hack hypertable to implement new
>>> cell types ?
>>> That way a new cell_type will possibly be just a new file with a class
>>> that inherits this CELL() base class ?
>>> Or is this way slow?
>>> Still values will be returned as bytes to the client (didn't see the
>>> thrift-code) ?
>>> And I think it will be easier this way for developers to implement new
>>> cell types?
>>> Or will this be done for the 2.0 roadmap item Data Types?
>>>
>>> Thanks
>>>
>>>
>>> On Mon, Jun 23, 2014 at 8:19 PM, Doug Judd <[email protected]> wrote:
>>>
>>>> Hello,
>>>>
>>>> Now that Hypertable version 0.9.8.0 is out the door, I'd like to point
>>>> out some significant changes that have gone into the release. These
>>>> changes are described below.
>>>>
>>>> 1. Default port numbers have changed from 380XX to 1586X. The
>>>> reason for this is that on some Linux systems, the ephemeral port range
>>>> goes from 32768-65535 which was causing startup problems due to port
>>>> conflicts.
>>>>
>>>> 2. Improved secondary index support. The new secondary index
>>>> support has been vastly improved. You can read all about it in User
>>>> Guide - Secondary Indices
>>>> <http://hypertable.com/documentation/user_guide/#secondary-indices>.
>>>>
>>>> 3. All timestamps passed through the Hypertable APIs now undergo
>>>> localtime conversion.
>>>>
>>>> 4. Added ability to add and remove secondary indices with the ALTER
>>>> TABLE
>>>> <http://hypertable.com/documentation/reference_manual/hql#alter-table>
>>>> command.
>>>>
>>>> 5. Added a REBUILD INDICES
>>>> <http://hypertable.com/documentation/reference_manual/hql#rebuild-indices>
>>>> command.
>>>>
>>>> 6. Improved schema, access group, and column family specifications
>>>> by making them more uniform. Changed the semantics of the table_alter
>>>> <http://hypertable.com/documentation/reference_manual/thrift_api/#tablealter>
>>>> to accept a schema object returned by table_get_schema
>>>> <http://hypertable.com/documentation/reference_manual/thrift_api/#tablegetschema>
>>>> and then modified as desired.
>>>>
>>>> 7. Added a Developer Guide
>>>> <http://hypertable.com/documentation/developer_guide/> to the
>>>> Hypertable website. This guide illustrates how to build thrift client
>>>> programs, exercising the various APIs, in all supported languages (Perl
>>>> TBD). The guide is now auto-generated from a system test, so all of the
>>>> code examples will remain valid and will compile and run and will not go
>>>> stale over time.
>>>>
>>>> 8. The query cache now invalidates on row+column_family instead of
>>>> just the row. This will improve the performance of read heavy applications
>>>> that use multiple column families.
>>>>
>>>> 9. Upgraded Thrift to > 0.9.1
>>>>
>>>> - Doug
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "Hypertable User" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To post to this group, send email to [email protected].
>>>>
>>>> Visit this group at http://groups.google.com/group/hypertable-user.
>>>>
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Hypertable Development" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>>
>>> Visit this group at http://groups.google.com/group/hypertable-dev.
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>> --
>> Doug Judd
>> CEO, Hypertable Inc.
>>
>

--
Doug Judd
CEO, Hypertable Inc.
