Hi,

First up, I would like to say I'm really excited by the Blur project; it
seems to fit the needs of a potential project perfectly. I'm hoping that I
can someday contribute back to this project in some way, as it seems it
will be of enormous help to me.

Now, on to the meat of the issue. I'm a complete search newbie. I am coming
from a Spring/application-development background but have to get involved
in the search/big-data field for a current client. Since the new year I
have been looking at Hadoop and have set up a small cluster using Cloudera's
excellent tools. I've been downloading datasets, running MR jobs, etc., and
think I have gleaned a basic level of knowledge, enough for me to learn
more when I need it. This week I started looking at Blur; at present I have
cloned the source to the Hadoop namenode, where I have built and started
the Blur servers. But now I am stuck and don't know where to go, so I will
ask the following:

1 - /apache-blur-0.2.0-SNAPSHOT/conf/servers: at present I just have my
namenode defined here. Do I need to add my datanodes as well?
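
To illustrate what I mean, is the file supposed to end up looking something
like this, one hostname per line for every machine that should run a shard
server? (Hostnames below are made up, not my actual nodes.)

```
namenode01
datanode01
datanode02
```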

2 - blur> create repl-table hdfs://localhost:9000/blur/repl-table 1
java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed on
connection exception: java.net.ConnectException: Connection refused.

I’m confused here. Is 9000 the correct port? Is there some sort of user
auth issue?
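
For context on the port question: my understanding is that the URI I pass
to create has to match whatever fs.default.name (fs.defaultFS on newer
Hadoop) is set to in core-site.xml, something like the fragment below
(values are illustrative, not copied from my config). Is that right, and
should I just check with netstat that the namenode is actually listening on
that port?

```xml
<!-- core-site.xml (illustrative values, not my actual config) -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
```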

3 - Assuming I create a table on HDFS, when I want to import my data into
it, I use an MR job, yes? What is the best way to package this job? Do I
have to include all the Blur jars, or do I install Blur on the datanodes
and set a classpath? Is it possible to link to an example MR job in a Maven
project? Or am I on completely the wrong track?
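
For example, if a self-contained ("fat") jar is the right approach, would
something like the standard maven-shade-plugin setup below be appropriate?
(This is just generic shade-plugin configuration from the Maven docs,
nothing Blur-specific; I'm guessing here.)

```xml
<!-- pom.xml snippet: bundle the job and its dependencies into one jar -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.1</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```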

Thanks for your help,

Paul.
