Hi,
I have a bunch of files stored in the HDFS directory /unix_files, 319
files in total.
scala> val errlog = sc.textFile("/unix_files/*.ksh")
scala> errlog.filter(line => line.contains("sed")).count()
res104: Long = 1113
So it returns 1113 lines that contain the word "sed".
If I want to see the collection I can do
scala> errlog.filter(line => line.contains("sed")).collect()
res105: Array[String] = Array(" DSQUERY=${1} ; DBNAME=${2} ; ERROR=0 ;
PROGNAME=$(basename $0 | sed -e s/.ksh//)", # . in environment based on
argument for script., " exec sp_spaceused", " exec sp_spaceused",
PROGNAME=$(basename $0 | sed -e s/.ksh//), " BACKUPSERVER=$5 # Server
that is used to load the transaction dump", " BACKUPSERVER=$5 # Server
that is used to load the transaction dump", " BACKUPSERVER=$5 # Server
that is used to load the transaction dump", " cat
$TMPDIR/${DBNAME}_trandump.sql | sed s/${DSQUERY}/${REMOTESERVER}/ >
$TMPDIR/${DBNAME}_trandump.tmpsql", cat
$TMPDIR/${DBNAME}_tran_transfer.sql | sed s/${DSQUERY}/${REMOTESERVER}/
> $TMPDIR/${DBNAME}_tran_transfer.tmpsql, PROGNAME=$(basename $0 | sed -e
s/.ksh//), " B...
scala>
Now, is there any way I can retrieve all of these lines? Or are they
all in the array already, and the shell is only displaying the first few?
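For what it is worth, here is a minimal sketch of what I am after, assuming the shell is only truncating the display of the returned Array and that the data itself is complete. The sample lines and the names `matches` and `numbered` are made up for illustration; in my session `matches` would be the result of the collect() call above.

```scala
// Stand-in for errlog.filter(line => line.contains("sed")).collect();
// these two sample lines are hypothetical, not real output.
val matches: Array[String] = Array(
  "PROGNAME=$(basename $0 | sed -e s/.ksh//)",
  "cat in.sql | sed s/a/b/ > out.sql"
)

// Number every element and join them with newlines, so that each
// matching line is printed in full instead of the shell's one-line summary.
val numbered: String =
  matches.zipWithIndex
    .map { case (line, i) => s"${i + 1}: $line" }
    .mkString("\n")

println(numbered)
```

In spark-shell the shorter form would presumably be errlog.filter(line => line.contains("sed")).collect().foreach(println), or take(20) in place of collect() if I only want a sample and do not want to pull everything back to the driver.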
Thanks,
Mich