RE: Hive Installation Problem

2010-02-05 Thread baburaj . S
I have tried the same, but the installation still gives the same error. I 
don't know whether it is looking in the cache. Can we change 
ivysettings.xml so that it resolves the file from the file system rather 
than through a URL?

Babu


-----Original Message-----
From: Zheng Shao [mailto:zsh...@gmail.com]
Sent: Friday, February 05, 2010 12:47 PM
To: hive-user@hadoop.apache.org
Subject: Re: Hive Installation Problem

Added to http://wiki.apache.org/hadoop/Hive/FAQ

Zheng

On Thu, Feb 4, 2010 at 11:11 PM, Zheng Shao zsh...@gmail.com wrote:
 Try this:

 cd ~/.ant/cache/hadoop/core/sources
 wget http://archive.apache.org/dist/hadoop/core/hadoop-0.20.1/hadoop-0.20.1.tar.gz


 Zheng

 On Thu, Feb 4, 2010 at 10:23 PM, baburaj.S babura...@onmobile.com wrote:
 Hello,

 I am new to Hadoop and am trying to install Hive. We have the following 
 setup on our side:

 OS - Ubuntu 9.10
 Hadoop - 0.20.1
 Hive installation tried - 0.4.0

 Hadoop is installed and working fine. When we were installing Hive, I got 
 an error that it couldn't resolve the dependencies. I changed the shims 
 build and properties xml so that the dependencies look for Hadoop 0.20.1, 
 but now when I run the ant script I get the following error:

 ivy-retrieve-hadoop-source:
 [ivy:retrieve] :: Ivy 2.0.0-rc2 - 20081028224207 :: http://ant.apache.org/ivy/ :
 :: loading settings :: file = /master/hive/ivy/ivysettings.xml
 [ivy:retrieve] :: resolving dependencies :: org.apache.hadoop.hive#shims;working
 [ivy:retrieve]  confs: [default]
 [ivy:retrieve] :: resolution report :: resolve 953885ms :: artifacts dl 0ms
 ---------------------------------------------------------------------
 |                  |            modules            ||   artifacts   |
 |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
 ---------------------------------------------------------------------
 |      default     |   1   |   0   |   0   |   0   ||   0   |   0   |
 ---------------------------------------------------------------------
 [ivy:retrieve]
 [ivy:retrieve] :: problems summary ::
 [ivy:retrieve]  WARNINGS
 [ivy:retrieve]  module not found: hadoop#core;0.20.1
 [ivy:retrieve]   hadoop-source: tried
 [ivy:retrieve]    -- artifact hadoop#core;0.20.1!hadoop.tar.gz(source):
 [ivy:retrieve]    http://archive.apache.org/dist/hadoop/core/hadoop-0.20.1/hadoop-0.20.1.tar.gz
 [ivy:retrieve]   apache-snapshot: tried
 [ivy:retrieve]    https://repository.apache.org/content/repositories/snapshots/hadoop/core/0.20.1/core-0.20.1.pom
 [ivy:retrieve]    -- artifact hadoop#core;0.20.1!hadoop.tar.gz(source):
 [ivy:retrieve]    https://repository.apache.org/content/repositories/snapshots/hadoop/core/0.20.1/hadoop-0.20.1.tar.gz
 [ivy:retrieve]   maven2: tried
 [ivy:retrieve]    http://repo1.maven.org/maven2/hadoop/core/0.20.1/core-0.20.1.pom
 [ivy:retrieve]    -- artifact hadoop#core;0.20.1!hadoop.tar.gz(source):
 [ivy:retrieve]    http://repo1.maven.org/maven2/hadoop/core/0.20.1/core-0.20.1.tar.gz
 [ivy:retrieve]  ::
 [ivy:retrieve]  ::  UNRESOLVED DEPENDENCIES ::
 [ivy:retrieve]  ::
 [ivy:retrieve]  :: hadoop#core;0.20.1: not found
 [ivy:retrieve]  ::
 [ivy:retrieve]  ERRORS
 [ivy:retrieve]  Server access Error: Connection timed out url=http://archive.apache.org/dist/hadoop/core/hadoop-0.20.1/hadoop-0.20.1.tar.gz
 [ivy:retrieve]  Server access Error: Connection timed out url=https://repository.apache.org/content/repositories/snapshots/hadoop/core/0.20.1/core-0.20.1.pom
 [ivy:retrieve]  Server access Error: Connection timed out url=https://repository.apache.org/content/repositories/snapshots/hadoop/core/0.20.1/hadoop-0.20.1.tar.gz
 [ivy:retrieve]  Server access Error: Connection timed out url=http://repo1.maven.org/maven2/hadoop/core/0.20.1/core-0.20.1.pom
 [ivy:retrieve]  Server access Error: Connection timed out url=http://repo1.maven.org/maven2/hadoop/core/0.20.1/core-0.20.1.tar.gz
 [ivy:retrieve]
 [ivy:retrieve] :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS

 BUILD FAILED
 /master/hive/build.xml:148: The following error occurred while executing this line:
 /master/hive/build.xml:93: The following error occurred while executing this line:
 /master/hive/shims/build.xml:64: The following error occurred while executing this line:
 /master/hive/build-common.xml:172: impossible to resolve dependencies:
 resolve failed - see output for details

 Total time: 15 minutes 55 seconds


 I have even tried downloading hadoop-0.20.1.tar.gz and putting it in the 
 user's ant cache, but the same error is repeated. I am stuck and unable 
 to install it.

 Any help with the above will be greatly appreciated.

 Babu



Re: Hive Installation Problem

2010-02-05 Thread Carl Steinbach
Hi Babu,

~/.ant/cache is the default Ivy cache directory for Hive, but if the
environment variable IVY_HOME is set it will use $IVY_HOME/cache instead.
Is it possible that you have this environment variable set to a value
different than ~/.ant?
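
To check which cache directory a build would use under the rule described above, a minimal shell sketch follows. The function name ivy_cache_dir is hypothetical; it only mirrors the stated fallback (IVY_HOME if set, otherwise ~/.ant) and does not inspect Hive's build files.

```shell
#!/bin/sh
# Hypothetical helper: print the Ivy cache directory under the rule stated
# above ($IVY_HOME/cache if IVY_HOME is set and non-empty, else ~/.ant/cache).
ivy_cache_dir() {
    printf '%s\n' "${IVY_HOME:-$HOME/.ant}/cache"
}

cache_dir=$(ivy_cache_dir)
echo "Ivy cache directory: $cache_dir"

# List any cached Hadoop source tarballs, if the directory exists.
if [ -d "$cache_dir/hadoop/core/sources" ]; then
    ls -l "$cache_dir/hadoop/core/sources"
else
    echo "No hadoop/core/sources directory under $cache_dir"
fi
```

If the printed directory is not the one the tarball was copied into, that would explain why the build still tries to download it.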


LZO Compression on trunk

2010-02-05 Thread Bennie Schut
I have a tab-separated file that I loaded with LOAD DATA INPATH. 
Then I run:


SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
SET mapred.map.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
select distinct login_cldr_id as cldr_id from chatsessions_load;

Ended Job = job_201001151039_1641
OK
NULL
NULL
NULL
Time taken: 49.06 seconds

However, if I run it without the SET commands I get this:
Ended Job = job_201001151039_1642
OK
2283
Time taken: 45.308 seconds

Which is the correct result.

When I do an INSERT OVERWRITE into an RCFile table, it actually 
compresses the data correctly.

When I disable compression and query this new table, the result is correct.
When I enable compression, it's wrong again.
I see no errors in the logs.

Any ideas why this might happen?




RE: Hive Installation Problem

2010-02-05 Thread baburaj . S
No, I don't have the variable defined. Is there anything else I should check? 
Is this happening because I am trying with Hadoop 0.20.1?

Babu



RE: Hive Installation Problem

2010-02-05 Thread Vidyasagar Venkata Nallapati
Hi,

We are still getting the problem:

[ivy:retrieve] no resolved descriptor found: launching default resolve
Overriding previous definition of property ivy.version
[ivy:retrieve] using ivy parser to parse file:/master/hadoop/hive/shims/ivy.xml
[ivy:retrieve] :: resolving dependencies :: org.apache.hadoop.hive#shims;work...@ph1
[ivy:retrieve]  confs: [default]
[ivy:retrieve]  validate = true
[ivy:retrieve]  refresh = false
[ivy:retrieve] resolving dependencies for configuration 'default'
[ivy:retrieve] == resolving dependencies for org.apache.hadoop.hive#shims;work...@ph1 [default]
[ivy:retrieve] == resolving dependencies org.apache.hadoop.hive#shims;work...@ph1-hadoop#core;0.20.1 [default-*]
[ivy:retrieve] default: Checking cache for: dependency: hadoop#core;0.20.1 {*=[*]}
[ivy:retrieve]  hadoop-source: no ivy file nor artifact found for hadoop#core;0.20.1
[ivy:retrieve]  tried https://repository.apache.org/content/repositories/snapshots/hadoop/core/0.20.1/core-0.20.1.pom

The .pom for this is not being copied. Please suggest something for this.

Regards
Vidyasagar N V


Re: computing median and percentiles

2010-02-05 Thread Jerome Boulon
Hi Bryan,
I'm working on HIVE-259. I'll post an update early next week.
/Jerome.


On 2/4/10 9:08 PM, Bryan Talbot btal...@aeriagames.com wrote:

 What's the best way to compute median and other percentiles using Hive 0.4.0?
 I've run across http://issues.apache.org/jira/browse/HIVE-259 but there
 doesn't seem to be any planned implementation yet.
 
 
 -Bryan
 
 
 
 
 



Re: LZO Compression on trunk

2010-02-05 Thread Zheng Shao
That seems to be a bug.
Are you using hive trunk or any release?



Yours,
Zheng


Re: Hive Installation Problem

2010-02-05 Thread Zheng Shao
Hi guys,

Can you try making the following directory match mine?
Once that is done, remove the build directory and run ant package.

Does this solve the problem?



[zs...@dev ~/.ant] ls -lR
.:
total 3896
drwxr-xr-x  2 zshao users    4096 Feb  5 13:04 apache-ivy-2.0.0-rc2
-rw-r--r--  1 zshao users 3965953 Nov  4  2008 apache-ivy-2.0.0-rc2-bin.zip
-rw-r--r--  1 zshao users       0 Feb  5 13:04 apache-ivy-2.0.0-rc2.installed
drwxr-xr-x  3 zshao users    4096 Feb  5 13:07 cache
drwxr-xr-x  2 zshao users    4096 Feb  5 13:04 lib

./apache-ivy-2.0.0-rc2:
total 880
-rw-r--r--  1 zshao users 893199 Oct 28  2008 ivy-2.0.0-rc2.jar

./cache:
total 4
drwxr-xr-x  3 zshao users 4096 Feb  4 19:30 hadoop

./cache/hadoop:
total 4
drwxr-xr-x  3 zshao users 4096 Feb  5 13:08 core

./cache/hadoop/core:
total 4
drwxr-xr-x  2 zshao users 4096 Feb  4 19:30 sources

./cache/hadoop/core/sources:
total 127436
-rw-r--r--  1 zshao users 14427013 Aug 20  2008 hadoop-0.17.2.1.tar.gz
-rw-r--r--  1 zshao users 30705253 Jan 22  2009 hadoop-0.18.3.tar.gz
-rw-r--r--  1 zshao users 42266180 Nov 13  2008 hadoop-0.19.0.tar.gz
-rw-r--r--  1 zshao users 42813980 Apr  8  2009 hadoop-0.20.0.tar.gz

./lib:
total 880
-rw-r--r--  1 zshao users 893199 Feb  5 13:04 ivy-2.0.0-rc2.jar


Zheng
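
On hosts where the download times out, one way to reproduce the layout shown above is to copy an already-downloaded tarball into the cache by hand. This is only a sketch: seed_ivy_cache is a hypothetical helper, the tarball path in the usage line is an assumption, and whether Ivy then accepts the cached file depends on the build's Ivy settings.

```shell
#!/bin/sh
# Hypothetical helper: copy a locally available Hadoop tarball into the
# cache layout shown above (<ivy home>/cache/hadoop/core/sources).
# Usage: seed_ivy_cache /path/to/hadoop-X.Y.Z.tar.gz [ivy_home]
seed_ivy_cache() {
    tarball=$1
    # Second argument overrides the Ivy home; default is IVY_HOME or ~/.ant.
    ivy_home=${2:-${IVY_HOME:-$HOME/.ant}}
    dest="$ivy_home/cache/hadoop/core/sources"

    if [ ! -f "$tarball" ]; then
        echo "tarball not found: $tarball" >&2
        return 1
    fi
    # Create the directory, copy the file, and print where it landed.
    mkdir -p "$dest" && cp "$tarball" "$dest/" && echo "$dest/$(basename "$tarball")"
}
```

For example, `seed_ivy_cache ~/downloads/hadoop-0.20.1.tar.gz` followed by removing the build directory and running ant package.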


heads up on ivy upgrade

2010-02-05 Thread John Sichi
Hi all,

Zheng just committed my patch for HIVE-1120, which upgrades ivy from 2.0 to 2.1 
and refines the new offline mode for the Hive build.

After updating your sandbox with this patch, you'll need to delete your 
IVY_HOME directory (typically ~/.ant unless you have set it explicitly), 
otherwise you'll get errors the next time you try to run ant package.  See 
JIRA for an example of the error message.

Unfortunately, this will mean that ivy will have to re-download the big hadoop 
dependencies on your next build, and as a number of people have reported 
recently, this seems to be a little flaky currently.  This patch won't improve 
that situation (since from what I've seen the flakiness comes from the source 
repositories), but it shouldn't make it any worse, and once you get it and 
successfully re-download, you should be able to add ANT_ARGS=-Doffline=true 
to your shell environment and successfully work disconnected after that.

For the flakiness, I'm going to take a look at ivysettings.xml to see if we can 
improve the repository situation via mirroring.

JVS
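
The steps above could be scripted roughly as follows. This is a cautious sketch: the function name ivy_home_dir is hypothetical, and the destructive and build commands are left commented out so the path can be checked first.

```shell
#!/bin/sh
# Hypothetical helper: the Ivy home is $IVY_HOME if set, otherwise ~/.ant.
ivy_home_dir() {
    printf '%s\n' "${IVY_HOME:-$HOME/.ant}"
}

echo "Ivy home to delete after applying the patch: $(ivy_home_dir)"

# Uncomment once the path printed above has been verified:
# rm -rf "$(ivy_home_dir)"

# Rebuild once while online so Ivy can re-download its dependencies:
# ant package

# After a successful download, later builds can run disconnected:
# export ANT_ARGS=-Doffline=true
```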



Re: LZO Compression on trunk

2010-02-05 Thread Yongqiang He
Hi Bennie,
Can you post your hadoop version and hive version?

Thanks
Yongqiang






Re: LZO Compression on trunk

2010-02-05 Thread Bennie Schut
Hadoop 0.20.1 and Hive trunk from this week. On Monday I'll try an 
older version of Hive to see if that helps, and perhaps also gzip to see 
whether the problem is compression in general.
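
The gzip comparison mentioned above could be run along the lines of the sketch below. It assumes the hive CLI is on the PATH and reuses the table and column from the original report; the function name compressed_query and the HIVE_RUN variable are hypothetical, and org.apache.hadoop.io.compress.GzipCodec is Hadoop's stock gzip codec class.

```shell
#!/bin/sh
# Hypothetical helper: emit the session from the original report, but with
# a caller-supplied compression codec class in place of the LZO codec.
compressed_query() {
    codec=$1
    cat <<EOF
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=$codec;
SET mapred.map.output.compression.codec=$codec;
select distinct login_cldr_id as cldr_id from chatsessions_load;
EOF
}

# Print the gzip variant; set HIVE_RUN=1 to actually execute it with hive.
query=$(compressed_query org.apache.hadoop.io.compress.GzipCodec)
echo "$query"
if [ "${HIVE_RUN:-0}" = "1" ]; then
    hive -e "$query"
fi
```

If the gzip run also returns NULL rows, the problem is probably not specific to the LZO codec.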






Re: Hive Installation Problem

2010-02-05 Thread John Sichi
By the way, the current IVY_HOME detection in build-common.xml is broken 
because it doesn't do:

<property environment="env"/>

first.

I'll log a JIRA issue for it, but it seems there are other problems with it 
even after I fix that, since the build is currently installing ivy under 
build/ivy rather than under ${ivy.home}; nothing else in build-common.xml 
references ivy.home.

JVS
