Many thanks Jakob. 

So it basically boils down to adding the periods as suggested, which
looks clearer:

val errlog = sc.textFile("/unix_files/*.ksh")
errlog.filter(line => line.contains("sed")).collect().foreach(line => println(line))
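
For anyone following along without a Spark shell, here is a minimal sketch of the same dotted chain on a plain Scala collection. The `sampleLines` values are made up for illustration and a local `Seq` stands in for the RDD, so there is no `collect()` step here. It also shows that the long and short forms of `foreach` print the same thing, one element per line, instead of the shortened summary the REPL gives for a long Array:

```scala
// Hypothetical stand-in for the RDD: a few made-up script lines in a local Seq.
val sampleLines = Seq(
  "PROGNAME=$(basename $0 | sed -e s/.ksh//)",
  "exec sp_spaceused",
  "cat input.sql | sed s/old/new/ > output.sql"
)

// Same filter as above, with a period before the method call.
val matches = sampleLines.filter(line => line.contains("sed"))

// Long form: an explicit function literal.
matches.foreach(line => println(line))

// Short form: println is turned into a function automatically.
matches.foreach(println)
```

This can be pasted into a Scala REPL as is. On a real RDD the chain would keep the `.collect()` call before `.foreach(...)` so that the printing happens on the driver.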

Regards, 

Mich 

On 10/02/2016 23:21, Jakob Odersky wrote: 

> Hi Mich,
> your assumptions 1 to 3 are all correct (nitpick: they're method
> *calls*, the methods being the part before the parentheses, but I
> assume that's what you meant). The last one is also a method call but
> uses syntactic sugar on top: `foreach(println)` boils down to
> `foreach(line => println(line))`.
> 
> On an unrelated side-note, I would suggest you add a period between
> every method call, it makes things easier to read and is actually
> required in certain circumstances. Specifically I would add a period
> before collect() and foreach().
> 
> best,
> --Jakob
> 
> On Wed, Feb 10, 2016 at 2:35 PM, Mich Talebzadeh
> <mich.talebza...@cloudtechnologypartners.co.uk> wrote:
> > Hi Chandeep
> >
> > Many thanks for your help
> >
> > In the line below
> >
> > errlog.filter(line => line.contains("sed"))collect()foreach(println)
> >
> > Can you please clarify the components with the correct naming as I am
> > new to Scala
> >
> > errlog --> is the RDD?
> > filter(line => line.contains("sed")) is a method
> > collect() is another method ?
> > foreach (println) ?
> >
> > Thanks
> >
> > On 10/02/2016 21:28, Chandeep Singh wrote:
> > > Hi Mich,
> > >
> > > If you would like to print everything to the console you could -
> > > errlog.filter(line => line.contains("sed"))collect()foreach(println)
> > > or you could always save to a file using any of the saveAs methods.
> > >
> > > Thanks,
> > > Chandeep
> > >
> > > On Wed, Feb 10, 2016 at 8:14 PM,
> > > <mich.talebza...@cloudtechnologypartners.co.uk> wrote:
> > > > Hi,
> > > >
> > > > I have a bunch of files stored in the hdfs /unit_files directory,
> > > > 319 files in total
> > > >
> > > > scala> val errlog = sc.textFile("/unix_files/*.ksh")
> > > >
> > > > scala> errlog.filter(line => line.contains("sed"))count()
> > > > res104: Long = 1113
> > > >
> > > > So it returns 1113 instances of the word "sed"
> > > >
> > > > If I want to see the collection I can do
> > > >
> > > > scala> errlog.filter(line => line.contains("sed"))collect()
> > > > res105: Array[String] = Array(" DSQUERY=${1} ; DBNAME=${2} ; ERROR=0 ;
> > > > PROGNAME=$(basename $0 | sed -e s/.ksh//)", # . in environment based on
> > > > argument for script., " exec sp_spaceused", " exec sp_spaceused",
> > > > PROGNAME=$(basename $0 | sed -e s/.ksh//), " BACKUPSERVER=$5 # Server that is
> > > > used to load the transaction dump", " BACKUPSERVER=$5 # Server that is used to
> > > > load the transaction dump", " BACKUPSERVER=$5 # Server that is used to load the
> > > > transaction dump", " cat $TMPDIR/${DBNAME}_trandump.sql | sed
> > > > s/${DSQUERY}/${REMOTESERVER}/ > $TMPDIR/${DBNAME}_trandump.tmpsql", cat
> > > > $TMPDIR/${DBNAME}_tran_transfer.sql | sed s/${DSQUERY}/${REMOTESERVER}/ >
> > > > $TMPDIR/${DBNAME}_tran_transfer.tmpsql, PROGNAME=$(basename $0 | sed -e
> > > > s/.ksh//), " B...
> > > >
> > > > scala>
> > > >
> > > > Now is there any way I can retrieve all these instances, or perhaps
> > > > they are all wrapped up and I only see a few lines?
> > > >
> > > > Thanks,
> > > >
> > > > Mich
> > > >
> > > > --
> > > > Dr Mich Talebzadeh
> > > >
> > > > LinkedIn
> > > > https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw [1]
> > > >
> > > > http://talebzadehmich.wordpress.com [2]
> > > >
> > > > NOTE: The information in this email is proprietary and confidential.
> > > > This message is for the designated recipient only, if you are not the
> > > > intended recipient, you should destroy it immediately. Any information
> > > > in this message shall not be understood as given or endorsed by Cloud
> > > > Technology Partners Ltd, its subsidiaries or their employees, unless
> > > > expressly so stated. It is the responsibility of the recipient to
> > > > ensure that this email is virus free, therefore neither Cloud
> > > > Technology Partners Ltd, its subsidiaries nor their employees accept
> > > > any responsibility.

-- 

Dr Mich Talebzadeh

LinkedIn
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

Links:
------
[1]
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
[2] http://talebzadehmich.wordpress.com
