Re: Question on Spark code

2017-07-23 Thread Reynold Xin
This is a standard practice used for chaining, to support

a.setStepSize(..)
  .setRegParam(...)
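
A minimal sketch of the pattern (the class and method names below are made up
for illustration, not the actual GradientDescent code). Declaring the return
type as this.type rather than the class name means a subclass inherits the
setter and still gets its own type back, so a chain can keep calling
subclass-only methods:

class Optimizer {
  private var stepSize: Double = 1.0

  // Returning this.type (not Optimizer) preserves the static type of the callee.
  def setStepSize(step: Double): this.type = {
    require(step > 0, s"Initial step size must be positive but got $step")
    this.stepSize = step
    this  // the same object is returned
  }
}

class MomentumOptimizer extends Optimizer {
  private var momentum: Double = 0.9
  def setMomentum(m: Double): this.type = { this.momentum = m; this }
}

// Chaining still works when the first call hits the inherited setter:
new MomentumOptimizer().setStepSize(0.01).setMomentum(0.95)
// If setStepSize returned Optimizer, .setMomentum would not compile here.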


On Sun, Jul 23, 2017 at 8:47 PM, tao zhan  wrote:

> Thank you for replying.
> But I do not get it completely. Why is the "this.type" necessary?
> Why could it not be like:
>
> def setStepSize(step: Double): Unit = {
>   require(step > 0,
>     s"Initial step size must be positive but got ${step}")
>   this.stepSize = step
> }
>
> On Mon, Jul 24, 2017 at 11:29 AM, M. Muvaffak ONUŞ <
> onus.muvaf...@gmail.com> wrote:
>
>> Doesn't it mean the return type will be the type of "this" class? So it
>> doesn't have to be this particular instance of the class, but it has to have
>> the type of this instance of the class. When you have a stack of inheritance
>> and call that function, it will return the same type as the level at which
>> you called it.
>>
>> On Sun, Jul 23, 2017 at 8:20 PM Reynold Xin  wrote:
>>
>>> It means the same object ("this") is returned.
>>>
>>> On Sun, Jul 23, 2017 at 8:16 PM, tao zhan  wrote:
>>>
 Hello,

 I am new to scala and spark.
 What is the "this.type" in the set function for?


 https://github.com/apache/spark/blob/481f0792944d9a77f0fe8b5e2596da1d600b9d0a/mllib/src/main/scala/org/apache/spark/mllib/optimization/GradientDescent.scala#L48

 Thanks!

 Zhan

>>>
>>>
>


Re: Question on Spark code

2017-07-23 Thread Reynold Xin
It means the same object ("this") is returned.
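
A quick way to see that (a sketch with a made-up class, not the actual Spark
code): the setter hands back the very same instance, which eq confirms.

class Config {
  private var level = 0
  def setLevel(n: Int): this.type = { level = n; this }
}

val c = new Config
val returned = c.setLevel(3)
println(returned eq c)  // true: the same object, not a copy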

On Sun, Jul 23, 2017 at 8:16 PM, tao zhan  wrote:

> Hello,
>
> I am new to scala and spark.
> What is the "this.type" in the set function for?
>
>
> https://github.com/apache/spark/blob/481f0792944d9a77f0fe8b5e2596da1d600b9d0a/mllib/src/main/scala/org/apache/spark/mllib/optimization/GradientDescent.scala#L48
>
> Thanks!
>
> Zhan
>


Re: Question on Spark code

2017-06-26 Thread Steve Loughran

On 25 Jun 2017, at 20:57, kant kodali wrote:

impressive! I need to learn more about scala.

What I mean by stripping away the conditional check in Java is this.

static final boolean isLogInfoEnabled = false;

public void logMessage(String message) {
    if (isLogInfoEnabled) {
        log.info(message);
    }
}

If you look at the bytecode, the dead if check will be removed.

Generally it's skipped in Java too now that people have moved to the SLF4J
APIs, which do on-demand string expansion:

LOG.info("network IO failure from {} source to {}", src, dest, ex)

That only builds the final string, calling src.toString() and dest.toString()
when needed, and handling null values too. So you can skip those guards
everywhere. But the string template is still constructed; it's not free, and
there's some merit in maintaining the guard at debug level, though I don't
personally bother.
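
A small sketch of the two styles, written in Scala against the SLF4J API (the
logger name, message, and values are illustrative):

import org.slf4j.LoggerFactory

object LogStyles {
  private val log = LoggerFactory.getLogger("example")

  def report(src: String, dest: String): Unit = {
    // Guarded style: the interpolated string is only built when INFO is enabled.
    if (log.isInfoEnabled) log.info(s"network IO failure from $src to $dest")

    // Parameterized style: the {} placeholders (and the toString calls on
    // src and dest) are only expanded if INFO is enabled, so the guard can
    // usually be dropped; the template literal itself is still passed each time.
    log.info("network IO failure from {} source to {}", src, dest)
  }
}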

The spark one takes a closure, so it can do much more. However, you shouldn't 
do anything with side effects, or indeed, anything prone to throwing 
exceptions. Always try to write .toString() methods which are robust against 
null values, that is: valid for the entire life of an instance. Your debuggers 
will appreciate it too.
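
For instance, a minimal sketch (the class here is invented for illustration):

class Connection(var src: String, var dest: String) {
  // A toString that never throws, even while fields are still null, is safe
  // to call from lazily evaluated log closures and from debuggers at any
  // point in the object's life.
  override def toString: String =
    s"Connection(src=${Option(src).getOrElse("<unset>")}, " +
      s"dest=${Option(dest).getOrElse("<unset>")})"
}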



Re: Question on Spark code

2017-06-25 Thread kant kodali
impressive! I need to learn more about scala.

What I mean by stripping away the conditional check in Java is this.

static final boolean isLogInfoEnabled = false;

public void logMessage(String message) {
    if (isLogInfoEnabled) {
        log.info(message);
    }
}

If you look at the bytecode, the dead if check will be removed.


On Sun, Jun 25, 2017 at 12:46 PM, Sean Owen  wrote:

> I think it's more precise to say args like any expression are evaluated
> when their value is required. It's just that this special syntax causes
> extra code to be generated that makes it effectively a function passed, not
> value, and one that's lazily evaluated. Look at the bytecode if you're
> curious.
>
> An if conditional is pretty trivial to evaluate here. I don't think that
> guidance is sound. The point is that it's not worth complicating the caller
> code in almost all cases by checking the guard condition manually.
>
> I'm not sure what you're referring to, but no, no compiler can elide these
> conditions. They're based on runtime values that can change at runtime.
>
> scala has an @elidable annotation which you can use to indicate to the
> compiler that the declaration should be entirely omitted when configured to
> elide above a certain detail level. This is how scalac elides assertions if
> you ask it to and you can do it to your own code. But this is something
> different, not what's happening here, and a fairly niche/advanced feature.
>
>
> On Sun, Jun 25, 2017 at 8:25 PM kant kodali  wrote:
>
>> @Sean Got it! I come from Java world so I guess I was wrong in assuming
>> that arguments are evaluated during the method invocation time. How about
>> the conditional checks to see if the log is InfoEnabled or DebugEnabled?
>> For Example,
>>
>> if (log.isInfoEnabled) log.info(msg)
>>
>> I hear we should use a guard condition only when construction of the string
>> "msg" is expensive; otherwise we take a performance hit from the additional
>> "if" check, unless the "log" itself is declared static final and the Scala
>> compiler strips away the "if" check and produces efficient bytecode. Also,
>> log.info does the log.isInfoEnabled check inside its body before logging
>> the message.
>>
>> https://github.com/qos-ch/slf4j/blob/master/slf4j-simple/src/main/java/org/slf4j/simple/SimpleLogger.java#L509
>> https://github.com/qos-ch/slf4j/blob/master/slf4j-simple/src/main/java/org/slf4j/simple/SimpleLogger.java#L599
>>
>> Please correct me if I am wrong.
>>
>>
>>
>>
>> On Sun, Jun 25, 2017 at 3:04 AM, Sean Owen  wrote:
>>
>>> Maybe you are looking for declarations like this. "=> String" means the
>>> arg isn't evaluated until it's used, which is just what you want with log
>>> statements. The message isn't constructed unless it will be logged.
>>>
>>> protected def logInfo(msg: => String) {
>>>
>>>
>>> On Sun, Jun 25, 2017 at 10:28 AM kant kodali  wrote:
>>>
 Hi All,

 I came across this file
 https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/internal/Logging.scala
 and I am wondering what the purpose of this is. In particular, it doesn't
 prevent any string concatenation, and the if checks are already done by the
 library itself, right?

 Thanks!


>>


Re: Question on Spark code

2017-06-25 Thread Sean Owen
I think it's more precise to say args like any expression are evaluated
when their value is required. It's just that this special syntax causes
extra code to be generated that makes it effectively a function passed, not
value, and one that's lazily evaluated. Look at the bytecode if you're
curious.

An if conditional is pretty trivial to evaluate here. I don't think that
guidance is sound. The point is that it's not worth complicating the caller
code in almost all cases by checking the guard condition manually.

I'm not sure what you're referring to, but no, no compiler can elide these
conditions. They're based on runtime values that can change at runtime.

scala has an @elidable annotation which you can use to indicate to the
compiler that the declaration should be entirely omitted when configured to
elide above a certain detail level. This is how scalac elides assertions if
you ask it to and you can do it to your own code. But this is something
different, not what's happening here, and a fairly niche/advanced feature.
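
For completeness, a sketch of that separate mechanism (the object, method name,
and chosen level here are arbitrary):

import scala.annotation.elidable
import scala.annotation.elidable.FINE

object Tracing {
  // With e.g. "scalac -Xelide-below 800" (the INFO level), calls to this
  // method are removed at compile time and their arguments are never
  // evaluated; FINE corresponds to level 500.
  @elidable(FINE)
  def trace(msg: String): Unit = println(msg)
}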


On Sun, Jun 25, 2017 at 8:25 PM kant kodali  wrote:

> @Sean Got it! I come from Java world so I guess I was wrong in assuming
> that arguments are evaluated during the method invocation time. How about
> the conditional checks to see if the log is InfoEnabled or DebugEnabled?
> For Example,
>
> if (log.isInfoEnabled) log.info(msg)
>
> I hear we should use a guard condition only when construction of the string
> "msg" is expensive; otherwise we take a performance hit from the additional
> "if" check, unless the "log" itself is declared static final and the Scala
> compiler strips away the "if" check and produces efficient bytecode. Also,
> log.info does the log.isInfoEnabled check inside its body before logging
> the message.
>
>
> https://github.com/qos-ch/slf4j/blob/master/slf4j-simple/src/main/java/org/slf4j/simple/SimpleLogger.java#L509
>
> https://github.com/qos-ch/slf4j/blob/master/slf4j-simple/src/main/java/org/slf4j/simple/SimpleLogger.java#L599
>
> Please correct me if I am wrong.
>
>
>
>
> On Sun, Jun 25, 2017 at 3:04 AM, Sean Owen  wrote:
>
>> Maybe you are looking for declarations like this. "=> String" means the
>> arg isn't evaluated until it's used, which is just what you want with log
>> statements. The message isn't constructed unless it will be logged.
>>
>> protected def logInfo(msg: => String) {
>>
>>
>> On Sun, Jun 25, 2017 at 10:28 AM kant kodali  wrote:
>>
>>> Hi All,
>>>
>>> I came across this file
>>> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/internal/Logging.scala
>>> and I am wondering what the purpose of this is. In particular, it doesn't
>>> prevent any string concatenation, and the if checks are already done by
>>> the library itself, right?
>>>
>>> Thanks!
>>>
>>>
>


Re: Question on Spark code

2017-06-25 Thread kant kodali
@Sean Got it! I come from Java world so I guess I was wrong in assuming
that arguments are evaluated during the method invocation time. How about
the conditional checks to see if the log is InfoEnabled or DebugEnabled?
For Example,

if (log.isInfoEnabled) log.info(msg)

I hear we should use a guard condition only when construction of the string
"msg" is expensive; otherwise we take a performance hit from the additional
"if" check, unless the "log" itself is declared static final and the Scala
compiler strips away the "if" check and produces efficient bytecode. Also,
log.info does the log.isInfoEnabled check inside its body before logging
the message.

https://github.com/qos-ch/slf4j/blob/master/slf4j-simple/src/main/java/org/slf4j/simple/SimpleLogger.java#L509
https://github.com/qos-ch/slf4j/blob/master/slf4j-simple/src/main/java/org/slf4j/simple/SimpleLogger.java#L599

Please correct me if I am wrong.




On Sun, Jun 25, 2017 at 3:04 AM, Sean Owen  wrote:

> Maybe you are looking for declarations like this. "=> String" means the
> arg isn't evaluated until it's used, which is just what you want with log
> statements. The message isn't constructed unless it will be logged.
>
> protected def logInfo(msg: => String) {
>
>
> On Sun, Jun 25, 2017 at 10:28 AM kant kodali  wrote:
>
>> Hi All,
>>
>> I came across this file
>> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/internal/Logging.scala
>> and I am wondering what the purpose of this is. In particular, it doesn't
>> prevent any string concatenation, and the if checks are already done by
>> the library itself, right?
>>
>> Thanks!
>>
>>


Re: Question on Spark code

2017-06-25 Thread Sean Owen
Maybe you are looking for declarations like this. "=> String" means the arg
isn't evaluated until it's used, which is just what you want with log
statements. The message isn't constructed unless it will be logged.

protected def logInfo(msg: => String) {
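
A self-contained sketch of how the by-name parameter behaves (this is a
simplification for illustration, not the actual Spark Logging trait):

trait SimpleLogging {
  protected def isInfoEnabled: Boolean

  // msg is passed by name: the String expression is only evaluated if and
  // when `msg` is referenced inside the body.
  protected def logInfo(msg: => String): Unit = {
    if (isInfoEnabled) println(msg)
  }
}

object Demo extends SimpleLogging {
  protected def isInfoEnabled: Boolean = false

  def run(): Unit = {
    // expensiveSummary() is never called here: INFO is disabled, so the
    // by-name argument is never forced.
    logInfo(s"state: ${expensiveSummary()}")
  }

  private def expensiveSummary(): String = {
    println("building summary...")  // would only print if the message were built
    "summary"
  }
}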

On Sun, Jun 25, 2017 at 10:28 AM kant kodali  wrote:

> Hi All,
>
> I came across this file
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/internal/Logging.scala
> and I am wondering what the purpose of this is. In particular, it doesn't
> prevent any string concatenation, and the if checks are already done by
> the library itself, right?
>
> Thanks!
>
>


Re: Question on Spark code

2017-06-25 Thread Herman van Hövell tot Westerflier
I am not getting the question. The logging trait does exactly what it says on
the box; I don't see what string concatenation has to do with it.

On Sun, Jun 25, 2017 at 11:27 AM, kant kodali  wrote:

> Hi All,
>
> I came across this file
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/internal/Logging.scala
> and I am wondering what the purpose of this is. In particular, it doesn't
> prevent any string concatenation, and the if checks are already done by
> the library itself, right?
>
> Thanks!
>
>


-- 

Herman van Hövell

Software Engineer

Databricks Inc.

hvanhov...@databricks.com

+31 6 420 590 27

databricks.com







Question on Spark code

2017-06-25 Thread kant kodali
Hi All,

I came across this file
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/internal/Logging.scala
and I am wondering what the purpose of this is. In particular, it doesn't
prevent any string concatenation, and the if checks are already done by
the library itself, right?

Thanks!