Re: East coast bigtop hackday/microtalks; Any interest?

2013-11-20 Thread Artem Ervits
If it's live-streamed, I'll forward the feed to some folks.


Artem Ervits
Data Analyst
New York Presbyterian Hospital

From: Jay Vyas [mailto:jayunit...@gmail.com]
Sent: Wednesday, November 20, 2013 08:15 AM
To: user@bigtop.apache.org
Subject: Re: East coast bigtop hackday/microtalks; Any interest?

Artem: We have the Hartford scalable computation meetup that I run.
Should we do the first one in Stamford? That's about halfway between, and I'm
more than happy to coordinate.


On Wed, Nov 20, 2013 at 12:03 AM, Konstantin Boudnik <c...@apache.org> wrote:
And as I said before - we'll do hangout video feed to simulate a presence
effect ;) I will send the URL once we get closer to the event

Cos

On Sun, Nov 17, 2013 at 06:14PM, Roman Shaposhnik wrote:
 On Tue, Nov 12, 2013 at 8:55 AM, Jay Vyas <jayunit...@gmail.com> wrote:
  Hi folks. Is anyone interested in attending a Bigtop meetup in either
  Connecticut or Massachusetts, to coincide with the one Kons is planning in
  California?
 
  I'm sure some folks around here are using Bigtop either for testing or
  development... maybe in the New York or Boston areas?
 
  I'm in Hartford, so either location is valid for me... Connecticut would also
  work (of course), as it's central to both. :)

 Would be really nice to see an East Coast Bigtop meetup.

 In general, for all the remote folks -- we typically do have quite
 a lively IRC presence on #bigtop during our meetups/hackathons.
 It is not quite like being in the same room, but way better than
 nothing.

 Thanks,
 Roman.



--
Jay Vyas
http://jayunit100.blogspot.com



Re: Getting Started Guide? (and some installation issues)

2013-11-20 Thread ivaylo frankov
Hi Steve,

Can you send me your private email address? Then I can send you my
configuration as it stands so far.

So far I have HBase, partly Hive, and partly Pig (not much, but still something ;)).
I think an experienced specialist needs a few hours at most to configure Bigtop;
a newbie like me needs a few days at least ;).
I want to start a pseudo-distributed node with HBase, Pig, Giraph, Solr, Mahout,
and Hue. Flume is also interesting.
What are you aiming for?

Best regards,
Ivo
P.S.: If you want Spark, be very careful when installing it. You have to
specify the repository, or you may receive some other "Spark" language package
instead; at least that was the case on Ubuntu.

On Wednesday, 20 November 2013, i.frankov wrote:

 Hi Steve,
 I am not at home.
 I will send you my status tonight.

 Cheers,
 Ivo


 Sent from Samsung Mobile



Re: Getting Started Guide? (and some installation issues)

2013-11-20 Thread Steven Núñez
Hi Ivo,

I’m also working, to start with, on a pseudo-distributed node. I’ve got a 
CentOS 6.x Amazon EC2 instance and am using FreeNX for the remote desktop. So far 
there seem to be several services that aren’t started after the BigTop 
install. Here’s the output of 'service --status-all':

Flume NG agent is running                           [  OK  ]
Hadoop datanode is running                          [  OK  ]
Hadoop journalnode is running                       [  OK  ]
Hadoop namenode is running                          [  OK  ]
Hadoop secondarynamenode is running                 [  OK  ]
Hadoop zkfc is dead and pid file exists             [FAILED]
Hadoop httpfs is running                            [  OK  ]
Hadoop historyserver is dead and pid file exists    [FAILED]
Hadoop nodemanager is dead and pid file exists      [FAILED]
Hadoop proxyserver is dead and pid file exists      [FAILED]
Hadoop resourcemanager is running                   [  OK  ]
hald (pid  1031) is running...
HBase master daemon is dead and pid file exists     [FAILED]
hbase-regionserver is not running.
HBase rest daemon is running                        [  OK  ]
HBase thrift daemon is running                      [  OK  ]
HCatalog server is running                          [  OK  ]
Hive Metastore is dead and pid file exists          [FAILED]
Hive Server is running                              [  OK  ]
Hive Server2 is dead and pid file exists            [FAILED]
WEBHCat server is running                           [  OK  ]
…
Spark master is not running                         [FAILED]
Spark worker is not running                         [FAILED]
spice-vdagentd is stopped
Sqoop Server is running                             [  OK  ]
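
To pull just the problem services out of a listing like the one above, a small filter can help. This is a minimal sketch and not part of Bigtop: the function name failed_services is made up for illustration, and the patterns are assumptions based only on the status strings that appear in this output.

```shell
# failed_services: hypothetical helper, not a Bigtop command.
# Reads `service --status-all` style output on stdin and prints only the
# lines whose status text suggests a problem. `|| true` keeps the exit
# status clean when nothing matches.
failed_services() {
  grep -E 'FAILED|is dead|is not running|is stopped|not running but' || true
}

# Example run on a few lines copied from the listing above.
sample_report=$(failed_services <<'EOF'
Hadoop datanode is running [  OK  ]
Hadoop zkfc is dead and pid file exists[FAILED]
hbase-regionserver is not running.
Sqoop Server is running[  OK  ]
EOF
)
printf '%s\n' "$sample_report"
```

On a live system the same function could be fed directly, for example with 'service --status-all 2>&1 | failed_services'.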

Am I correct in assuming that all of these services should be working properly 
after a BigTop install? As far as the individual components go, I’ve been 
working through the "Running various Bigtop components" wiki page 
(https://cwiki.apache.org/confluence/display/BIGTOP/Running+various+Bigtop+components). 
Pig and HBase are working so far, but Hive fails:

hive> create table doh(id int);
FAILED: Error in metadata: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

This is after changing permissions to be world writable on:

  *   /tmp
  *   /user/hive/warehouse (actually, BigTop set that permission correctly)
  *   /var/lib/hive/metastore/metastore_db/ (CentOS filesystem)

Any ideas? I notice that Hive Server 2 'is dead and pid file exists’; looks 
like a good place to start.


Regards,
- Steve


--
Illation Pty Ltd
8/350 Collins Street
Melbourne 3000

T:   +61 3 8399 9442 x100
M: +61 4 0096 4240



Hive MetaStore Start Error

2013-11-20 Thread Steven Núñez
Tracing down the Hive Metastore error, it seems that the Thrift Server can’t 
create a socket. Any ideas?

Cheers,
- Steve

2013-11-21 07:08:48,056 ERROR metastore.HiveMetaStore (HiveMetaStore.java:main(4242)) - Metastore Thrift Server threw an exception...
org.apache.thrift.transport.TTransportException: Could not create ServerSocket on address 0.0.0.0/0.0.0.0:9083.
        at org.apache.thrift.transport.TServerSocket.<init>(TServerSocket.java:93)
        at org.apache.thrift.transport.TServerSocket.<init>(TServerSocket.java:75)
        at org.apache.hadoop.hive.metastore.TServerSocketKeepAlive.<init>(TServerSocketKeepAlive.java:34)
        at org.apache.hadoop.hive.metastore.HiveMetaStore.startMetaStore(HiveMetaStore.java:4282)
        at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:4239)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)




Re: Getting Started Guide? (and some installation issues)

2013-11-20 Thread Sean Mackrory
> Is there a ‘getting started’ guide?

Beyond just installation, most of our documentation is very
developer-centric, I'm afraid. What there is can be found on our wiki:
https://cwiki.apache.org/confluence/display/BIGTOP/Index

> Something that will describe the filesystem and configuration file
> conventions?

Bigtop is a distribution of other open-source projects, so there is no
single configuration system. The file conventions will vary from project to
project, however Bigtop does not modify much about how the configuration
files work, so I would refer you to the upstream projects for details of
their configuration files (eg. http://hadoop.apache.org,
http://hbase.apache.org)

> In particular the existence of these conf.empty directories is confusing.

The conf.dist and conf.empty directories provide some default or template
configuration files. You should create a directory at the same level for
your own configuration, perhaps conf.steven. There is a symlink for each
component at /etc/<component>/conf. This symlink, through a system called
alternatives, eventually points to the currently active configuration for
that component. Once you have modified the configuration to suit your
needs, you can make it the active configuration using the alternatives
command. See here for its documentation:
http://linux.about.com/library/cmd/blcmdl8_alternatives.htm. For example,
if you look at the /etc/hadoop/conf symlink, you will probably find that it
points to /etc/alternatives/hadoop-conf. You can see how the alternatives
are configured and point the configuration to your new folder like this:

alternatives --display hadoop-conf
alternatives --set hadoop-conf /etc/hadoop/conf.steven
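
The symlink chain described above can be tried out safely in a scratch directory. The sketch below is only an illustration of the mechanism: it builds a stand-in for the /etc layout under a temporary directory (all paths are made up for the demo) and repoints the middle link by hand, which is roughly the effect of 'alternatives --set'. On a real system the new directory generally also has to be registered first with 'alternatives --install <link> <name> <path> <priority>' before '--set' will accept it.

```shell
# Throwaway sandbox that mimics the layout described above; nothing under
# the real /etc is touched. All paths below are illustrative.
sandbox=$(cd "$(mktemp -d)" && pwd -P)
mkdir -p "$sandbox/etc/hadoop/conf.empty" "$sandbox/etc/hadoop/conf.steven" \
         "$sandbox/etc/alternatives"

# /etc/hadoop/conf -> /etc/alternatives/hadoop-conf -> active configuration
ln -s "$sandbox/etc/hadoop/conf.empty" "$sandbox/etc/alternatives/hadoop-conf"
ln -s "$sandbox/etc/alternatives/hadoop-conf" "$sandbox/etc/hadoop/conf"
before=$(readlink -f "$sandbox/etc/hadoop/conf")

# Roughly what `alternatives --set` does: repoint the middle link.
ln -sfn "$sandbox/etc/hadoop/conf.steven" "$sandbox/etc/alternatives/hadoop-conf"
after=$(readlink -f "$sandbox/etc/hadoop/conf")

echo "before: ${before#"$sandbox"}"
echo "after:  ${after#"$sandbox"}"
rm -rf "$sandbox"
```

The point of the indirection is that /etc/hadoop/conf itself never changes; only the link under /etc/alternatives is switched.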

> Is Hue supposed to be configured separately, or is BigTop supposed to do
> that?

As I recall, the misconfigurations that are reported at startup are things
like services not running (like Oozie, etc.) Once you configure and start
those services, these warnings should disappear. For other warnings, post
them here and we'll see if we can help you.

> What is the target time to set-up a Hadoop installation via BigTop?

Not sure what to tell you here. I regularly set up pseudo-distributed
Hadoop installations in minutes with little more than 'yum install
hadoop-conf-pseudo', 'sudo service hadoop-hdfs-namenode init', and a reboot.
If you're using a bunch of other services on a fully-distributed cluster
and you're completely new to this, I would expect it to take hours or days to
get everything running. Bigtop also maintains Puppet code that will
configure everything with a pretty good default configuration and have your
cluster working pretty much out-of-the-box. Maybe this is a good option for
you?

> Can you send me your private email and I will be able to send you my
> configuration up to now.

As I mentioned, our documentation is very developer-centric, and as Steven
is showing, some user-centric documentation would be a huge help to the
community. Could I persuade you to share what you've learned on the mailing
list, or perhaps on the wiki so others can benefit?





FW: Hive Clues

2013-11-20 Thread Steven Núñez
Gents,

Happy to discuss the problems and resolutions here so there’s a community 
record. Below are a few clues on the Hive problems.

- SteveN

From: Steven Nunez <steven.nu...@illation.com>
Date: Thursday, 21 November 2013 9:32
To: i.fran...@googlemail.com
Subject: Hive Clues

Hi Ivo,

I don’t know how far you got with Hive, but I have found some clues as to why 
my configuration isn’t working. I sent a message to the group at large 
regarding some socket bind errors. After some searching, I found a post on 
Hive 2 installation 
(http://danieladeniji.wordpress.com/2013/05/24/technical-hadoopcloudera-cdhhive-v2-installation/) 
that suggests /etc/hive/conf/hive-site.xml needs a hive.metastore.uris 
property entry. This didn’t work for me, but I’ll try a reboot and some other 
measures to be sure.
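
A quick way to see whether a given hive-site.xml already carries that property is to extract its value. The following is a minimal sketch under stated assumptions: metastore_uri is a made-up helper name, it assumes the common layout in which <value> directly follows <name>, and the demo runs against a throwaway file rather than the real /etc/hive/conf/hive-site.xml.

```shell
# metastore_uri: hypothetical helper that prints the value of
# hive.metastore.uris from a hive-site.xml file, or nothing if it is unset.
# Assumes the usual layout where <value> directly follows <name>.
metastore_uri() {
  tr -d '\n' < "$1" |
    sed -n 's|.*<name>hive\.metastore\.uris</name>[[:space:]]*<value>\([^<]*\)</value>.*|\1|p'
}

# Example against a throwaway file; on a real install the path would be
# /etc/hive/conf/hive-site.xml.
demo_site=$(mktemp)
cat > "$demo_site" <<'EOF'
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://localhost:9083</value>
  </property>
</configuration>
EOF
demo_uri=$(metastore_uri "$demo_site")
echo "hive.metastore.uris = $demo_uri"
rm -f "$demo_site"
```

An empty result would suggest the property is missing from the file that was checked, which is worth confirming against the active configuration directory rather than a template one.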

Have you worked out what these conf.empty, conf.dist, conf.* directories are 
for? Is there a convention for their usage?

Regards,
- SteveN


CentOS Out of Box Install Summary

2013-11-20 Thread Steven Núñez
Gents,

Below is a summary of the results of an out-of-the-box CentOS/EC2 BigTop 0.70.0 
install. It lists all the components I need for the project I’m writing about. 
What would be useful somewhere on the wiki is a list of known issues, with a 
page of possible resolutions. This could be as easy as taking this list and 
adding a third column, ‘workaround’, with a page on how to fix each issue. It 
could also be used as a QA page of sorts, on the assumption that all of the 
components are supposed to work out of the box (it looks like some of the 
init.d scripts aren’t quite right either, judging by the error below).

Cheers,
- SteveN

Hadoop datanode is running                          [  OK  ]
Hadoop journalnode is running                       [  OK  ]
Hadoop namenode is running                          [  OK  ]
Hadoop secondarynamenode is running                 [  OK  ]
Hadoop zkfc is dead and pid file exists             [FAILED]
Hadoop httpfs is running                            [  OK  ]
Hadoop historyserver is dead and pid file exists    [FAILED]
Hadoop nodemanager is dead and pid file exists      [FAILED]
Hadoop proxyserver is dead and pid file exists      [FAILED]
Hadoop resourcemanager is running                   [  OK  ]
hald (pid  1041) is running...
HBase master daemon is dead and pid file exists     [FAILED]
hbase-regionserver is not running.
HBase rest daemon is running                        [  OK  ]
HBase thrift daemon is running                      [  OK  ]
HCatalog server is running                          [  OK  ]
Hive Metastore is dead and pid file exists          [FAILED]
Hive Server is running                              [  OK  ]
Hive Server2 is dead and pid file exists            [FAILED]
not running but /var/run/oozie/oozie.pid exists.
Spark master is not running                         [FAILED]
Spark worker is not running                         [FAILED]
spice-vdagentd is stopped
Sqoop Server is running                             [  OK  ]



Re: Getting Started Guide? (and some installation issues)

2013-11-20 Thread Steven Núñez
Is this puppet code in the install? If so, where? I’m thinking I should perhaps 
start working with a src version of BigTop, but I was really hoping to spend as 
little time as possible on configuring my Hadoop ecosystem and maximize problem 
solving. Seems I keep getting deeper into the rabbit warren.


From: Sean Mackrory <mackror...@gmail.com>
Reply-To: user@bigtop.apache.org
Date: Thursday, 21 November 2013 10:10
To: user@bigtop.apache.org
Subject: Re: Getting Started Guide? (and some installation issues)

Bigtop also maintains puppet code that will configure everything with a pretty 
good default configuration and have your cluster working pretty much 
out-of-the-box. Maybe this is a good option for you?


Probably Bugs in Hive Install

2013-11-20 Thread Steven Núñez
Gents,

The summary of this is: the system is reporting that the metastore is not 
running (nor is Hive Server 2), yet Hive is now working. See the email below, 
where I was going to request diagnostic help, only to discover that the example 
now works, for reasons unknown.

Cheers,
- SteveN


-------- Original Message --------
I’ve got a bit closer. The metastore and hive server aren’t running:

org.apache.thrift.transport.TTransportException: Could not create ServerSocket on address 0.0.0.0/0.0.0.0:9083
org.apache.thrift.transport.TTransportException: Could not create ServerSocket on address 0.0.0.0/0.0.0.0:1

Because something is already bound to those ports (from netstat):

tcp        0      0 0.0.0.0:1          0.0.0.0:*          LISTEN      3071/java
tcp        0      0 0.0.0.0:9083       0.0.0.0:*          LISTEN      2827/java

Whatever is listening on those ports doesn’t speak HTTP, and 1 just closes 
a telnet connection straight away.
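
Since netstat already shows a PID next to each listener, the owning process can be looked up directly. A small sketch of that step follows; port_listener is a hypothetical helper name, and it assumes the column layout of 'netstat -tlnp' on Linux (local address in column 4, state in column 6, PID/program in column 7).

```shell
# port_listener: hypothetical helper. Reads `netstat -tlnp` style output on
# stdin and prints the PID/program column for the given listening port.
port_listener() {
  awk -v port="$1" '$6 == "LISTEN" && $4 ~ (":" port "$") { print $7 }'
}

# Example on the 9083 line from the netstat output above.
owner=$(printf 'tcp 0 0 0.0.0.0:9083 0.0.0.0:* LISTEN 2827/java\n' | port_listener 9083)
echo "port 9083 is held by $owner"
```

On a live system this would be something like 'sudo netstat -tlnp | port_listener 9083', followed by 'ps -fp <pid>' to see the full command line of the Java process holding the port.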

Grepping through /etc doesn’t produce much of value, or anything I don’t 
already know:

nunez$ grep -R 9083 *
alternatives/hive-conf/hive-site.xml:  <value>thrift://localhost:9083</value>
grep: alternatives/oozie-tomcat-conf/webapps/oozie/ext-2.2: No such file or directory
default/hive-hcatalog-server:export METASTORE_PORT=9083
hive/conf.dist/hive-site.xml:  <value>thrift://localhost:9083</value>
hive/conf/hive-site.xml:  <value>thrift://localhost:9083</value>
grep: oozie/tomcat-deployment.http/webapps/oozie/ext-2.2: No such file or directory
grep: oozie/tomcat-deployment.https/webapps/oozie/ext-2.2: No such file or directory
grep: oozie/tomcat-deployment/webapps/oozie/ext-2.2: No such file or directory

I modified the hive-site.xml file based on a blog post I found earlier, but 
the port was bound even before that change.

So, does anyone have any ideas about who is listening to these ports and why? I 
would have thought them reserved for Hive, but some Java program thinks 
otherwise. This of course is assuming that my Hive failure is caused by 
something related:

STOP — THIS IS STRANGE.

So, what was going to appear here is the Hive failure I reported yesterday 
(http://mail-archives.apache.org/mod_mbox/bigtop-user/201311.mbox/%3cceb36003.217ff%25steve.nu...@illation.com%3e). 
However, upon repeating the commands, it now works:

hive> create table doh(id int);
OK
Time taken: 5.344 seconds

So, I guess the question now is: why is this working when 'service --status-all' 
reports that the services have failed, and the two socket errors above are 
in the log files?


Cheers,
- SteveN

P.S. Interesting that grep is reporting some kind of Oozie filesystem errors. 
Oozie is also failing, but I haven’t got around to looking at it yet. I wonder 
if these errors are part of the reason though.


0.70.0 Mahout Examples

2013-11-20 Thread Steven Núñez
Anyone got the Mahout examples running? The ones from the "Running Various 
Bigtop Components" page 
(https://cwiki.apache.org/confluence/display/BIGTOP/Running+various+Bigtop+components)?

I get:

/usr/lib/hadoop-hdfs/bin/hdfs: line 34: /usr/lib/hadoop-hdfs/bin/../libexec/hdfs-config.sh: No such file or directory
/usr/lib/hadoop-hdfs/bin/hdfs: line 150: cygpath: command not found
/usr/lib/hadoop-hdfs/bin/hdfs: line 191: exec: : not found

Why is this looking for a cygpath?

All the other Mahout examples seem to work.
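
One possible reading of the cygpath message, offered as a guess rather than a diagnosis: if the hdfs launcher guards its Cygwin branch with a test like 'if $cygwin; then', and that variable is normally initialised by the config script that failed to load on line 34, then the unset variable would make the test succeed and the Cygwin branch would run. Whether the packaged script is actually written this way is an assumption; the sketch below only demonstrates the underlying shell behaviour.

```shell
# Hypothetical reproduction: the variable was never initialised because the
# script that sets it could not be sourced.
unset cygwin

# `$cygwin` expands to nothing, leaving an empty command, and an empty
# command exits with status 0, so the `then` branch runs.
if $cygwin; then
  branch="cygwin branch taken"
else
  branch="cygwin branch skipped"
fi
echo "$branch"
```

If that reading holds, the cygpath and exec errors would both be follow-on effects of the missing hdfs-config.sh, so the first error line is the one to chase.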

- Steve


Re: Best Way to Determine Package Versions

2013-11-20 Thread Doug Chang
I look in bigtop.mk after uncompressing the source tarball or zip file
which corresponds to the version I am using.
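
A minimal sketch of that lookup, assuming the versions appear in bigtop.mk as assignments whose names end in _BASE_VERSION (that naming is an assumption here, so check it against the actual file). The helper name component_versions and the stand-in file contents are made up for illustration.

```shell
# component_versions: hypothetical helper that lists version assignments from
# a bigtop.mk file. The *_BASE_VERSION naming is an assumption about the file.
component_versions() {
  grep -E '^[A-Z0-9_]+_BASE_VERSION[[:space:]]*=' "$1" || true
}

# Example on a stand-in file; the component and version below are made up.
demo_mk=$(mktemp)
printf '%s\n' '# stand-in for bigtop.mk' 'EXAMPLE_NAME=example' \
  'EXAMPLE_BASE_VERSION=1.2.3' > "$demo_mk"
demo_versions=$(component_versions "$demo_mk")
echo "$demo_versions"
rm -f "$demo_mk"
```

Against a real source tree the call would simply be 'component_versions bigtop.mk' from the top-level directory.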


On Wed, Nov 20, 2013 at 8:44 PM, Steven Núñez <steven.nu...@illation.com> wrote:

   Gents,

  What is the best way to determine the particular version of a BigTop
 package? A command for this would be very useful. This particular use case
 involves trying out the Oozie component according to the Running Various
 BigTop Components wiki page
 (https://cwiki.apache.org/confluence/display/BIGTOP/Running+various+Bigtop+components).
 The instructions for Oozie appear to be out of date, and the Oozie
 website http://oozie.apache.org has different configurations for
 different versions.

  It seems that the Oozie instructions on that page are particularly out
 of date.

  Regards,
 - SteveN



Re: Best Way to Determine Package Versions

2013-11-20 Thread Bruno Mahé


I assume you are referring to an installed package

On a RPM based GNU/Linux distribution:
rpm -qi <package name>
rpm -qf <file or directory>

Note that you can also use yum (yum info, yum list).


For deb-based distributions, I don't have one handy right now, but the 
following link should help you in that regard:

http://www.debian.org/doc/manuals/debian-faq/ch-pkgtools.en.html
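
For scripting the same check across both families, something like the following could work. It is a sketch, not a Bigtop tool: pkg_version is a made-up wrapper name, and it relies only on the standard 'rpm -q --queryformat' and 'dpkg-query -W -f' query forms.

```shell
# pkg_version: hypothetical wrapper, not part of Bigtop. Prints the installed
# version of a package using whichever package database knows about it, and
# returns non-zero if the package is not found.
pkg_version() {
  if command -v rpm >/dev/null 2>&1 && rpm -q "$1" >/dev/null 2>&1; then
    rpm -q --queryformat '%{VERSION}-%{RELEASE}\n' "$1"
  elif command -v dpkg-query >/dev/null 2>&1; then
    dpkg-query -W -f='${Version}\n' "$1" 2>/dev/null
  else
    return 1
  fi
}
```

Usage would be along the lines of 'pkg_version oozie', assuming the package is installed under that name.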


Thanks,
Bruno