[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...

2015-07-03 Thread ejsarge-gr
Github user ejsarge-gr commented on the pull request:

https://github.com/apache/spark/pull/4998#issuecomment-118378860
  
Would we be able to document this better or re-examine it a little?

The problem is that log4j 1.2.17 is shaded into the assembly jar. Thus, to 
override it we need to get our jars listed at the start of the classpath. 
However extraClassPath says that it *appends* to the classpath (instead of 
prepending).

SPARK_CLASSPATH appears to be mostly 
[deprecated|http://mail-archives.us.apache.org/mod_mbox/spark-user/201503.mbox/%3C01a901d0547c$a23ba480$e6b2ed80$@innowireless.com%3E].

Log4j2 support is essentially required if you want to send the logs to 
centralised logging via Flume.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...

2015-03-16 Thread liorchaga
Github user liorchaga closed the pull request at:

https://github.com/apache/spark/pull/4998


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...

2015-03-16 Thread liorchaga
Github user liorchaga commented on the pull request:

https://github.com/apache/spark/pull/4998#issuecomment-81544184
  
I'm withdrawing the pull request.
We do not want to break compatibility to the community.
Instead, we are using SPARK_CLASSPATH to add the log4j 2.x and log4j1.2-api 
jars. Seems to work properly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...

2015-03-15 Thread liorchaga
Github user liorchaga commented on the pull request:

https://github.com/apache/spark/pull/4998#issuecomment-80944189
  
@srowen , the 1.2 - 2.x bridge should work (I would verify this today). 
But keep in mind it would require migrating log4j configuration to 2.x. Are we 
sure we want to take this course?
If we just exclude the log4j 1.2 implementation dependency (from default 
profile or with a new profile), we would maintain compatibility with existing 
spark deployments. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...

2015-03-15 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/4998#issuecomment-80938223
  
@tsliwowicz Have a read above, where I'm describing why this is more than 
just a build profile. My principal concern is as I say above: if log4j 1.2 and 
2 are mutually incompatible, then making a build with this profile means that 
_user apps_ become mutually incompatible between this and a normal build of 
Spark. That seems too surprising to advertise as just another build option like 
the others. It's not really a solution.

(Doesn't this leave 1.2 and 2 on the runtime classpath? how does that end 
up working when the assembly is created?)

I'm not clear if my suggestion above has been tried: what happens when 
log4j 2 and log4j 2's 1.2-to-2 bridge artifacts are built into Spark? This is 
_not_ the same as just putting log4j 1.2 and 2 on the classpath. It seems like 
log4j 2 is trying to provide this as a way to do both at once. I don't know if 
it works, but that *might* be a solution; config format may then be the problem.

This isn't a simple question of upgrading Spark for several reasons. Spark 
runs user code and is prone to classpath leakage; Spark's choice unfortunately 
affects user apps, unlike user apps. Shading is normally the answer to this, 
and some key dependencies have been shaded to get them out of the way; the 
userClassPathFirst mechanism is also a means to avoid some collisions. 

Logging frameworks like log4j are however particularly tricky to deal with: 
lots of packages use logging frameworks, and they want to be configured once, 
in one file. Shading and/or multiple classloaders interfere with this.

Unless something like the above works, the state of things seems to be: use 
slf4j in your app, or use log4j 1.2, or see if you can shade and separately 
configure your logging in your app.

My guess is that at Spark 2.x, a lot of dependencies can and will be 
updated, so log4j either goes away or moves to 2.x. And/or we figure out a way 
to shade it that doesn't break it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...

2015-03-15 Thread liorchaga
Github user liorchaga commented on the pull request:

https://github.com/apache/spark/pull/4998#issuecomment-80979001
  
log4j12-api bridge is working properly. 
Our executor code is using log4j 1.2.7, and this jar is included in the 
jars provided to sparkContext. The spark distribution itself is shipped with 
dependency in log4j12-api (instead of log4j12). The executor logging is 
delegated to log4j 2.x successfully.

As for the configuration - the syntax is not backwards compatible. Even the 
configuration property is changed (log4j.configuration for 1.2 vs. 
log4jConfigurationFile for 2.x).





---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...

2015-03-15 Thread tsliwowicz
Github user tsliwowicz commented on the pull request:

https://github.com/apache/spark/pull/4998#issuecomment-80878911
  
@srowen We don't know of an option to run side by side with two log4j 
versions. It conflicts on both slf4j and log4j classes. In any case, I believe 
it won't create a bigger problem because it just adds a build option. How is 
that a problem at all? For us, the log4j version is fixed for all drivers. The 
rest of the organisation switched to log4j 2, and we would like to do the same 
for Spark. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...

2015-03-15 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/4998#issuecomment-80947530
  
Yeah that's what I am hoping might work, to remove log4j 1.2 and replace 
with log4j 2 + 1.2-to-2 bridge, and also use the slfj4-to-log4j2 bridge, and 
update Spark itself to use log4j 2. The idea is that apps that expect to find 
log4j 1.2 still find its API. 

It's still somewhat problematic if user logging config stops working. Is it 
not backwards-compatible / accepts the old format too? I am also worried this 
somehow interacts badly with classloaders, but it stands a chance of working.

But if it works well I mean that the main build would do this, not a 
profile.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...

2015-03-13 Thread tsliwowicz
Github user tsliwowicz commented on the pull request:

https://github.com/apache/spark/pull/4998#issuecomment-78868836
  
@srowen this is in essence the same as the ability to control hadoop 
version only through classpath manipulation. Since there are many flavors, its 
being done by using different profiles. While Spark may choose to leave the 
default log4j support for 1.2, the only way to fully switch to log4j 2 is via 
changing the classpath. Using the same logic that was applied by the spark 
designer for hadoop, @liorchaga created the profile for log4j 2. Adding a build 
profile does not complicate things because the default (that is not specifying 
the profile) means the build stays the same. Only those who need it, will use 
the profile. Same as for various hadoop versions. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...

2015-03-13 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/4998#issuecomment-78893597
  
I think the difference is that Hadoop profiles build Spark for different, 
but fixed, deployment contexts. This is trying to accommodate two different 
types of _user_ app. This build of Spark won't run apps that normal builds 
will, and vice versa (right?). This is why I don't think a build profile helps 
people solve this problem (without creating a bigger one).

I'd really hope there's a flavor of log4j that supports old and new code 
and config; that would be much more ideal. Or, another answer is just that you 
can't use log4j 2 with Spark and need to use slf4j.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...

2015-03-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4998#issuecomment-78476242
  
Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...

2015-03-12 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/4998#issuecomment-78500551
  
Does it actually cause a problem? (I wouldn't be surprised, but I'm just 
asking.) Can you use slf4j, or do you need log4j 2? can you use the log4j 2 - 
log4j 1.2 adapter?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...

2015-03-12 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/4998#discussion_r26318113
  
--- Diff: pom.xml ---
@@ -1706,6 +1706,43 @@
 jline.groupidjline/jline.groupid
   /properties
 /profile
+profile
+idlog4j2/id
+activation
+propertynamelog4j2/name/property
+/activation
+properties
+log4j2.version2.1/log4j2.version
+/properties
+dependencies
--- End diff --

Hm. I wonder what would happen if Spark always used log4j 2.x, and then 
included the 1.2 - 2 bridge code just in case an app relies on Spark providing 
1.2?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...

2015-03-12 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/4998#issuecomment-78498110
  
No, we do not want a new profile for this. What problem are you trying to 
solve? Spark uses SLF4J, not a particular logging framework. However it 
activates log4j 1 by default, and uses log4j 1 to control logging levels. This 
shouldn't change. You should be able to ultimately get log4j 2 to work in your 
build though with enough exclusion and rewiring. There is not a reason that 
Spark needs to also use several version of log4j itself.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...

2015-03-12 Thread liorchaga
Github user liorchaga commented on the pull request:

https://github.com/apache/spark/pull/4998#issuecomment-78502617
  
The problem is having both log4j-over-slf4j and slf4j-log4j12 in the 
classpath. To get it working with log4j 1.2 we need to exclude slf4j-log4j12.
Also, having both log4j-1.2.x and log4j-core-2.x in the classpath is a 
problem because both have class org.apache.log4j.Logger.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...

2015-03-12 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/4998#issuecomment-78513631
  
Hm, the bridge I was thinking of is for 1.2 - 2, not the other way around.
I was thinking of 
http://logging.apache.org/log4j/log4j-2.1/log4j-to-slf4j/index.html as well but 
I imagine that has the class conflict problem.
Does `extraClassPath` make it happen to work though? your user code would 
see log4j 2, but so would Spark but that might happen to work? log4j isn't used 
much directly by Spark.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...

2015-03-12 Thread liorchaga
Github user liorchaga commented on the pull request:

https://github.com/apache/spark/pull/4998#issuecomment-78525274
  
On second thought, correct me if I'm wrong, in such case spark users would 
have to migrate their log config.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...

2015-03-12 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/4998#issuecomment-78534488
  
@liorchaga I was thinking that Spark would just use log4j 2, and include 
the log4j 1.2 bridge, so that no matter what, user apps would see the classes 
they need. I don't know if it means users would have to migrate config -- would 
log4j 2 + bridge not let them use their current config?

Otherwise, this is a pretty tough one. It's another symptom of classpath 
leakage, although this one is harder to deal with than usual. I'm hesitant to 
complicate things with another build profile, because I don't think Spark would 
make releases with log4j 2 flavors just for this anyway. The thing is, your app 
would work on your specially-built Spark but not others, and perhaps vice versa.

It's a sticky one for sure. If anyone has other approaches, maybe post them 
on the JIRA discussion?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...

2015-03-12 Thread liorchaga
Github user liorchaga commented on the pull request:

https://github.com/apache/spark/pull/4998#issuecomment-78523952
  
This would require all existing spark deployments to add the bridge to the 
classpath (or to rebuild the distribution with a profile that includes this 
dependency), unless they change their config to 2.x.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...

2015-03-12 Thread liorchaga
Github user liorchaga commented on the pull request:

https://github.com/apache/spark/pull/4998#issuecomment-78499766
  
We are using log4j 2.x in our executor code. I tried adding log4j 2.x 
dependencies using spark.executor.extraClassPath, but then we have both log4j 
1.2 and 2.x in our classpath (1.2 is shaded in spark-assembly).
Maybe I'm missing something here, but as I understand I must exclude the 
log4j 1.2 dependencies to get it working properly. 
We don't mind having a profile for just excluding the log4j 1.2 
dependencies, and than adding the 2.x jars with extraClassPath, if that is more 
acceptable.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...

2015-03-12 Thread liorchaga
Github user liorchaga commented on the pull request:

https://github.com/apache/spark/pull/4998#issuecomment-78515735
  
Tried it with extraClassPath. I provided through SPARK_EXECUTOR_OPTS the 
log4j.configurationFile property for Log4j 2.x.
In the log config, I declared two file appenders - one for all loggers with 
debug level, and another one to our executor code (also with debug level).
Both log files were created, but only the first (for all classes) had data 
in it, and it contained logging messages for spark code, not for my code. The 
second file was empty.
Once I excluded log4j 1.2 dependencies, I got messages for the executor 
code as well. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...

2015-03-12 Thread liorchaga
GitHub user liorchaga opened a pull request:

https://github.com/apache/spark/pull/4998

[SPARK-6305] add log4j2 profile and prevent log4j 1.2 shading

* log4j and slf4j-log4j12 compile dependencies in submodules removed
* log4j2 profile added with log4h12 and slf4j-log4j12 dependencies declared 
with provided scope (to avoid transitive dependencies when shading), and log4j2 
dependencies declared with compile scope
* updated documentation

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/taboola/spark master-log4j2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/4998.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4998


commit 41e3e3506e6bf123864606cf7ee4888283ce6bc8
Author: Lior Chaga lio...@taboola.com
Date:   2015-03-12T12:05:10Z

* log4j and slf4j-log4j12 compile dependencies in submodules removed
* log4j2 profile added with log4h12 and slf4j-log4j12 dependencies declared 
with provided scope (to avoid transitive dependencies when shading), and log4j2 
dependencies declared with compile scope
* updated documentation




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org