Ah yes, sorry about that. We had a problem with HOD not working well with 
piped inputs and outputs, so we actually use an expect script to interface 
with HOD. (We should open an issue on this.)

I'm attaching the script that we use.
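(We point -Dhod.command at this wrapper instead of at the hod binary, i.e.
something like -Dhod.command=/path/to/wrapper in the sort of incantation you
show below; the path is just wherever you save the attached script.)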

ben

On Wednesday 28 November 2007 11:38:09 Craig Macdonald wrote:
> Hi all,
>
> I've been trying to setup Pig using Hadoop on Demand. Using some
> hackery, my incantation now looks like
>
> PATH=/users/tr.craigm/OF_tools/python/bin/:$PATH ROOT=$PWD
> scripts/pig.pl -Dlog4j.level=debug -Dhod.server=local
> -Dhod.expect.root=$PWD -Dhod.command=hod/bin/hod
> -Dhod.expect.uselatest=hodrc/released -Dyinst.cluster=
> -Dhadoop.root.logger=DEBUG,console  --cluster hodrc
>
> (the name of my hodrc file is hodrc).
>
> However, the HOD connection code in PigContext mystifies me. Does it
> correspond to any released version of HOD?
> It seems to connect to HOD, and parse the response.
>
> PIG-18 (https://issues.apache.org/jira/browse/PIG-18) states that Pig
> needs to be fixed to work with hod 4.
> So I presume that Pig does not work with the HOD version
> hod-open-4.tar.gz attached to
> https://issues.apache.org/jira/browse/HADOOP-1301
>
> However, it doesn't look like Pig works with the other version of HOD
> attached to the same JIRA issue: hod.0.2.2.tar.gz
>
> PigContext.java looks for output from HOD in the form of lines starting:
> hdfsUI:
> hdfs:
> mapredUI:
> mapred:
> hadoopConf:
>
> I can't find any source in either version of HOD that resembles this.
> Does anyone know if Pig will currently work with any openly available
> version of HOD?
>
> Thanks in advance
>
> Craig


#!/usr/bin/expect
#
# This is a wretched expect script to start up HOD and scrape the necessary
# information we need to run Pig. Tragically, we can't just pipe HOD's output
# into a script, so we have to use expect. Also, the real information we need
# is not given to us on stdout; rather, stdout gives us the name of the
# configuration file that contains the information, and we have to write
# actual Tcl to parse that file.
#

trap handleExit {SIGINT SIGTERM SIGHUP SIGABRT SIGPIPE}

proc handleExit {} {
        # Send "exit" to the spawned hod process and wait (up to 20 seconds)
        # for its cleanup message before exiting ourselves.
        send "exit\n"
        set timeout 20
        expect "do not CTL-C"
        puts "Exiting"
        exit
}

#
# Quick and dirty parser to extract the value of mapred.job.tracker
#
proc extractMapRedHostPort {file} {
        set fh [open $file r]
        set line [read $fh]
        close $fh
        regexp {>mapred.job.tracker</name>[^<]*<value>([^<]*)</value>} $line match sub
        return $sub
}

#
# Quick and dirty parser to extract the value of fs.default.name
#
proc extractDFSHostPort {file} {
        set fh [open $file r]
        set line [read $fh]
        close $fh
        regexp {>fs.default.name</name>[^<]*<value>([^<]*)</value>} $line match sub
        return $sub
}
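
#
# For reference, these regexps assume the usual hadoop-site.xml property
# layout, e.g. (host and port here are made up):
#
#   <property>
#     <name>mapred.job.tracker</name>
#     <value>somehost.example.com:9001</value>
#   </property>
#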

# Default to passing "-m 15" to hod unless the caller supplied its own -m.
set mOpt {"-m" "15"}
foreach i $argv {
        if {$i == "-m"} {
                set mOpt {}
        }
}

log_user 0
set timeout -1
set args [concat $argv $mOpt]
spawn -ignore {SIGHUP} hod -n [lindex $args 0] [lindex $args 1] \
        [lindex $args 2] [lindex $args 3] [lindex $args 4] [lindex $args 5] \
        [lindex $args 6] [lindex $args 7] [lindex $args 8] [lindex $args 9] \
        [lindex $args 10]
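
#
# Emit the cluster information in exactly the line format PigContext scans
# for: hdfsUI:, mapredUI:, hadoopConf:, hdfs:, mapred:
#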

expect "HDFS UI on "
expect "\n"
puts -nonewline "hdfsUI: $expect_out(buffer)"

expect "Mapred UI on "
expect "\n"
puts -nonewline "mapredUI: $expect_out(buffer)"

expect "Hadoop config file in: "
expect "\n"
puts -nonewline "hadoopConf: $expect_out(buffer)"

puts "hdfs: [extractDFSHostPort [string trim $expect_out(buffer)]]\r"
puts "mapred: [extractMapRedHostPort [string trim $expect_out(buffer)]]\r"

#
# Now just wait forever. Eventually we will be ruthlessly killed.
#
expect_user {
        eof { handleExit }
        timeout {exp_continue}
}
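
For reference, a run looks roughly like this. The wrapper invokes hod with
the arguments you give it (plus "-m 15" unless you pass your own -m) and
prints exactly the lines PigContext looks for. The script name, host names,
ports and paths below are made up:

  $ ./hod-wrapper.exp <your hod arguments>
  hdfsUI: http://somehost.example.com:50070
  mapredUI: http://somehost.example.com:50030
  hadoopConf: /tmp/hod-conf-dir/hadoop-site.xml
  hdfs: somehost.example.com:9000
  mapred: somehost.example.com:9001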
