Hi Mich,

If you would like to print everything to the console, you could use:

errlog.filter(line => line.contains("sed")).collect().foreach(println)

or you could always save the result to a file using any of the saveAs methods.

Thanks,
Chandeep

On Wed, Feb 10, 2016 at 8:14 PM, <mich.talebza...@cloudtechnologypartners.co.uk> wrote:

> Hi,
>
> I have a bunch of files stored in the HDFS /unix_files directory, 319 files in total.
>
> scala> val errlog = sc.textFile("/unix_files/*.ksh")
>
> scala> errlog.filter(line => line.contains("sed")).count()
> res104: Long = 1113
>
> So it returns 1113 instances of the word "sed".
>
> If I want to see the collection I can do
>
> scala> errlog.filter(line => line.contains("sed")).collect()
> res105: Array[String] = Array(" DSQUERY=${1} ; DBNAME=${2} ; ERROR=0 ; PROGNAME=$(basename $0 | sed -e s/.ksh//)", # . in environment based on argument for script., " exec sp_spaceused", " exec sp_spaceused", PROGNAME=$(basename $0 | sed -e s/.ksh//), " BACKUPSERVER=$5 # Server that is used to load the transaction dump", " BACKUPSERVER=$5 # Server that is used to load the transaction dump", " BACKUPSERVER=$5 # Server that is used to load the transaction dump", " cat $TMPDIR/${DBNAME}_trandump.sql | sed s/${DSQUERY}/${REMOTESERVER}/ > $TMPDIR/${DBNAME}_trandump.tmpsql", cat $TMPDIR/${DBNAME}_tran_transfer.sql | sed s/${DSQUERY}/${REMOTESERVER}/ > $TMPDIR/${DBNAME}_tran_transfer.tmpsql, PROGNAME=$(basename $0 | sed -e s/.ksh//), " B...
> scala>
>
> Now is there any way I can retrieve all these instances, or perhaps they are
> all wrapped up and I only see a few lines?
>
> Thanks,
>
> Mich
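
For reference, here is a minimal sketch of the two options mentioned in the reply (printing and saving), meant to be pasted into spark-shell, where `sc` is already defined. It assumes the same `/unix_files/*.ksh` input as in the question. The output directory `/tmp/sed_matches` and the sample size of 20 are hypothetical values chosen only for illustration. As far as I know, the trailing `...` in `res105` is just the Scala REPL truncating how the result is displayed; `collect()` itself should still return all matching lines.

```scala
// Run inside spark-shell, where `sc` (the SparkContext) is predefined.
val errlog = sc.textFile("/unix_files/*.ksh")

// Keep only the lines that contain the substring "sed".
val sedLines = errlog.filter(line => line.contains("sed"))

// collect() brings every matching line back to the driver as an Array[String].
val all: Array[String] = sedLines.collect()
println(all.length) // should equal sedLines.count()

// Print a bounded sample rather than every line (20 is an arbitrary choice).
sedLines.take(20).foreach(println)

// Write the matches out as text files; the path is hypothetical and
// the directory must not already exist.
sedLines.saveAsTextFile("/tmp/sed_matches")
```

For large results, `take(n)` or `saveAsTextFile` is usually safer than `collect()`, since `collect()` pulls the whole result into the driver's memory.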