Re: Setting up a cluster

Alexandre BECHE Mon, 22 Jul 2013 05:28:56 -0700

Hi everybody,

I started to dump some data into my cluster and to play a bit with the
reference interpreter.
Today my dump is just a JSON file stored on my local FS. Could I put it in
HDFS? Is the HDFS scanner already available?


I understood that the reference interpreter is used for testing purposes
only. What would be the next step for me to run a full Drill query on my
dataset?

Thanks for your help,
Cheers,
Alex


On Wed, Jul 17, 2013 at 9:47 AM, Alexandre BECHE
<[email protected]> wrote:

> Thanks for your answer.
>
> But in the meantime, does it make sense to start deploying drillbit and
> get the first steps working?
> Which deployment model would you recommend (single machine or cluster
> mode)?
>
> Cheers,
> Alex
>
>
> On Tue, Jul 16, 2013 at 11:33 PM, Michael Hausenblas <
> [email protected]> wrote:
>
>>
>> > As far as I understood, this webUI http://srvgal85.deri.ie/apache-drill/ is
>> > currently working with elasticsearch (does it mean it will be supported in
>> > the future?).
>>
>>
>> Note that the above URL was just a demo deployment. As soon as DRILL-77
>> [1], the REST API, is resolved (or close to) I'll port the UI over to it.
>> ES was just necessary back then (Oct last year IIRC) as no backend existed.
>>
>> You can track the progress re the WebUI by subscribing to DRILL-58 [2]. I
>> would think that Hari soon has a breakthrough with the REST API and this
>> means I should get around to this also rather soonish.
>>
>> Hari, any thoughts re timing?
>>
>>
>> Cheers,
>>                 Michael
>>
>> [1] https://issues.apache.org/jira/browse/DRILL-77
>> [2] https://issues.apache.org/jira/browse/DRILL-58
>>
>> --
>> Michael Hausenblas
>> Ireland, Europe
>> http://mhausenblas.info/
>>
>> On 16 Jul 2013, at 22:26, Alexandre BECHE <[email protected]>
>> wrote:
>>
>> > Hi everybody,
>> >
>> > I finally got the cluster ready and almost working properly. Now, I hope to
>> > start the data acquisition by the end of the week. As I understood that the
>> > JSON scanner is much more advanced than the HBase one, I will start using
>> > it.
>> >
>> > Now I have a few more questions, maybe dedicated to Michael.
>> > As far as I understood, this webUI http://srvgal85.deri.ie/apache-drill/ is
>> > currently working with elasticsearch (does it mean it will be supported in
>> > the future?). My main question is: how far am I from using it with my own
>> > JSON data stored in HDFS (i.e. using the full stack) and what would be the
>> > first step?
>> >
>> > Thanks for your help,
>> > Cheers,
>> > Alex
>> >
>> >
>> >
>> > On Thu, Jul 11, 2013 at 2:36 AM, Ted Dunning <[email protected]> wrote:
>> >
>> >> Even that dependency change is often not necessary.  If your original
>> >> dependency looks like this:
>> >>
>> >>    <dependency>
>> >>      <groupId>org.apache.hadoop</groupId>
>> >>      <artifactId>hadoop-core</artifactId>
>> >>      <version>1.1.0</version>
>> >>      <scope>provided</scope>
>> >>    </dependency>
>> >>
>> >> then your code should work with either Apache Hadoop or MapR.  All you have
>> >> to do is make sure the jars for the distro you have are in the classpath
>> >> correctly.  This does not package the Hadoop jars into your executable
>> >> which can either be a virtue or a vice depending on your requirements.
>> >>
>> >>
>> >>
>> >> On Wed, Jul 10, 2013 at 2:21 PM, Jacques Nadeau <[email protected]>
>> >> wrote:
>> >>
>> >>> Ted gave a good long answer.  The short answer is that you'll need to
>> >>> change the dependency on Hadoop to use the MapR distribution which
>> >>> means changing:
>> >>>
>> >>> the following entry in two files:
>> >>> sandbox/prototype/exec/java-exec/pom.xml and
>> >>> sandbox/prototype/exec/ref/pom.xml
>> >>>
>> >>>      <groupId>org.apache.hadoop</groupId>
>> >>>      <artifactId>hadoop-core</artifactId>
>> >>>      <version>1.1.0</version>
>> >>>
>> >>> to: (note the change in version number)
>> >>>
>> >>>  <groupId>org.apache.hadoop</groupId>
>> >>>  <artifactId>hadoop-core</artifactId>
>> >>>  <version>1.0.3-mapr-2.1.3.1</version>
>> >>>
>> >>> and adding the mapr repository to the list of available repositories:
>> >>>
>> >>>  <repository>
>> >>>      <id>mapr-releases</id>
>> >>>      <url>http://repository.mapr.com/maven/</url>
>> >>>      <snapshots><enabled>false</enabled></snapshots>
>> >>>      <releases><enabled>true</enabled></releases>
>> >>>  </repository>
>> >>>
>> >>>
>> >>> As an Apache initiative, the goal of Drill is to work on all Hadoop
>> >>> distributions.
>> >>>
>> >>> Jacques
>> >>>
>> >>>
>> >>>
>> >>> On Wed, Jul 10, 2013 at 6:22 AM, Alexandre BECHE
>> >>> <[email protected]> wrote:
>> >>>> Dear drill dev,
>> >>>>
>> >>>> As discussed yesterday during the Hangout, I am currently setting up a
>> >>>> cluster using the M3 distribution.
>> >>>> I went through the MapR documentation for the installation and I found that
>> >>>> there are no NameNodes but an HDFS-compliant API. What is the impact of
>> >>>> that on DRILL? Is DRILL compatible with both systems (native HDFS and MapR
>> >>>> custom HDFS)?
>> >>>>
>> >>>> Cheers,
>> >>>> Alex
>> >>>
>> >>
>>
>>
>
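
For reference, the two pom.xml changes quoted above (the MapR version of
hadoop-core and the MapR repository) can be put together roughly as below.
This is only a sketch: the <repositories> and <dependencies> wrapper
elements are standard Maven, but where exactly they sit in
sandbox/prototype/exec/java-exec/pom.xml and
sandbox/prototype/exec/ref/pom.xml is an assumption, and the existing
entries in those files should be edited rather than duplicated.

```xml
<!-- Sketch only: fragment of a pom.xml, not a complete file. -->
<!-- Assumption: the repository list lives in the same pom (it could also be in a parent pom). -->
<repositories>
  <repository>
    <id>mapr-releases</id>
    <url>http://repository.mapr.com/maven/</url>
    <snapshots><enabled>false</enabled></snapshots>
    <releases><enabled>true</enabled></releases>
  </repository>
</repositories>

<dependencies>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-core</artifactId>
    <!-- MapR build of Hadoop, replacing the Apache version 1.1.0 -->
    <version>1.0.3-mapr-2.1.3.1</version>
  </dependency>
</dependencies>
```

If the original Apache dependency is instead kept with
<scope>provided</scope>, as Ted describes, neither change should be needed
as long as the jars of the installed distribution are on the classpath.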
