Hello Tom,

Steven just submitted a patch for a Hive SerDe storage engine, and I believe he was able to read sequence files with it. We will be adding a native reader in the future for better performance, but for now this should be a decent way to get sequence file data into Drill.

The patch is currently up for review, so if you are comfortable applying a patch, building the project, and trying to read some of your data, we would certainly appreciate feedback. It should be merged into mainline in the near future, which would remove the need to apply the patch.
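
For anyone who has not applied a review patch before, the steps could look roughly like the sketch below. It assumes an existing Git checkout of the Drill source and a patch file already downloaded from the review page. The function names, the file name serde.patch, and the Maven flags are illustrative assumptions, not something the review itself specifies.

```shell
#!/bin/sh
# Sketch only: helper names and the build flags are assumptions.

# Dry-run the patch first so a failed apply does not leave the tree half-modified.
apply_review_patch() {
    patch_file="$1"
    if ! git apply --check "$patch_file"; then
        printf 'patch does not apply cleanly: %s\n' "$patch_file" >&2
        return 1
    fi
    git apply "$patch_file"
}

# Build the project, skipping tests to shorten the first build (assumed flags).
build_project() {
    mvn clean install -DskipTests
}

# Example usage from the root of the source checkout (hypothetical file name):
#   apply_review_patch serde.patch && build_project
```

If the dry run fails, the checkout has probably moved ahead of the revision the patch was written against, and rebasing the patch or checking out an older revision would be the next thing to try.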
https://reviews.apache.org/r/17833/

-Jason Altekruse

On Fri, Feb 7, 2014 at 7:51 AM, Sebastian Schelter <[email protected]> wrote:

> There's no need to apologize for asking questions :)
>
> On 02/07/2014 02:49 PM, Tom Kiley wrote:
>
>> Hello,
>>
>> Are there plans to support Hadoop's Sequence File
>> (http://wiki.apache.org/hadoop/SequenceFile)? Or are they already
>> supported and I missed it? I could see this being useful to use Drill
>> on the output of MapReduce jobs.
>>
>> The sequence files I have are currently all NULL keys and JSON objects
>> as the value. Does anyone have a recommendation on converting to JSON
>> files or Parquet files for Drill? The JSON objects are generally the
>> same format, but there may be some outliers with differences. Some
>> fields may be nonexistent in some objects.
>>
>> Thanks,
>> Tom
>>
>> P.S. Apologies for the noob questions. I've just started looking at
>> Drill.
