What would be interesting would be to run a little experiment and find out what the default PATH is on your data nodes. How much of a pain would it be to run a little Python script that prints to stderr the values of the environment variables $PATH and $PWD (or the output of the shell command 'pwd')?
That's of course going through the normal channels of "add file". The thing is, given you're using a relative path "hive/parse_qx.py", you need to know what the "current directory" is when the process runs on the data nodes.

On Thu, Jun 20, 2013 at 5:32 AM, Stephen Boesch <java...@gmail.com> wrote:
>
> We have a few dozen files that need to be made available to all
> mappers/reducers in the cluster while running Hive transformation steps.
>
> It seems that "add archive" does not unarchive the entries and thus make
> them available directly on the default file path - and that is what we
> are looking for.
>
> To illustrate:
>
> add file modelfile.1;
> add file modelfile.2;
> ...
> add file modelfile.N;
>
> Then our model, which is invoked during the transformation step, *does*
> have correct access to its model files in the default path.
>
> But those model files take low *minutes* to all load. Instead, when we
> try:
>
> add archive modelArchive.tgz;
>
> the archive apparently does not get exploded.
>
> I have an archive, for example, that contains shell scripts under the
> "hive" directory stored inside. I am *not* able to access
> hive/my-shell-script.sh after adding the archive. Specifically, the
> following fails:
>
> $ tar -tvf appm*.tar.gz | grep launch-quixey_to_xml
> -rwxrwxr-x stephenb/stephenb 664 2013-06-18 17:46
> appminer/bin/launch-quixey_to_xml.sh
>
> from (select transform (aappname, qappname)
> *using* 'hive/parse_qx.py' as (aappname2 string, qappname2 string) from
> eqx) o insert overwrite table c select o.aappname2, o.qappname2;
>
> Cannot run program "hive/parse_qx.py": java.io.IOException: error=2, No
> such file or directory