[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...
Github user ejsarge-gr commented on the pull request: https://github.com/apache/spark/pull/4998#issuecomment-118378860 Would we be able to document this better or re-examine it a little? The problem is that log4j 1.2.17 is shaded into the assembly jar. Thus, to override it we need to get our jars listed at the start of the classpath. However extraClassPath says that it *appends* to the classpath (instead of prepending). SPARK_CLASSPATH appears to be mostly [deprecated|http://mail-archives.us.apache.org/mod_mbox/spark-user/201503.mbox/%3C01a901d0547c$a23ba480$e6b2ed80$@innowireless.com%3E]. Log4j2 support is essentially required if you want to send the logs to centralised logging via Flume. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...
Github user liorchaga closed the pull request at: https://github.com/apache/spark/pull/4998 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...
Github user liorchaga commented on the pull request: https://github.com/apache/spark/pull/4998#issuecomment-81544184 I'm withdrawing the pull request. We do not want to break compatibility to the community. Instead, we are using SPARK_CLASSPATH to add the log4j 2.x and log4j1.2-api jars. Seems to work properly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...
Github user liorchaga commented on the pull request: https://github.com/apache/spark/pull/4998#issuecomment-80944189 @srowen , the 1.2 - 2.x bridge should work (I would verify this today). But keep in mind it would require migrating log4j configuration to 2.x. Are we sure we want to take this course? If we just exclude the log4j 1.2 implementation dependency (from default profile or with a new profile), we would maintain compatibility with existing spark deployments. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/4998#issuecomment-80938223 @tsliwowicz Have a read above, where I'm describing why this is more than just a build profile. My principal concern is as I say above: if log4j 1.2 and 2 are mutually incompatible, then making a build with this profile means that _user apps_ become mutually incompatible between this and a normal build of Spark. That seems too surprising to advertise as just another build option like the others. It's not really a solution. (Doesn't this leave 1.2 and 2 on the runtime classpath? how does that end up working when the assembly is created?) I'm not clear if my suggestion above has been tried: what happens when log4j 2 and log4j 2's 1.2-to-2 bridge artifacts are built into Spark? This is _not_ the same as just putting log4j 1.2 and 2 on the classpath. It seems like log4j 2 is trying to provide this as a way to do both at once. I don't know if it works, but that *might* be a solution; config format may then be the problem. This isn't a simple question of upgrading Spark for several reasons. Spark runs user code and is prone to classpath leakage; Spark's choice unfortunately affects user apps, unlike user apps. Shading is normally the answer to this, and some key dependencies have been shaded to get them out of the way; the userClassPathFirst mechanism is also a means to avoid some collisions. Logging frameworks like log4j are however particularly tricky to deal with: lots of packages use logging frameworks, and they want to be configured once, in one file. Shading and/or multiple classloaders interfere with this. Unless something like the above works, the state of things seems to be: use slf4j in your app, or use log4j 1.2, or see if you can shade and separately configure your logging in your app. My guess is that at Spark 2.x, a lot of dependencies can and will be updated, so log4j either goes away or moves to 2.x. And/or we figure out a way to shade it that doesn't break it. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...
Github user liorchaga commented on the pull request: https://github.com/apache/spark/pull/4998#issuecomment-80979001 log4j12-api bridge is working properly. Our executor code is using log4j 1.2.7, and this jar is included in the jars provided to sparkContext. The spark distribution itself is shipped with dependency in log4j12-api (instead of log4j12). The executor logging is delegated to log4j 2.x successfully. As for the configuration - the syntax is not backwards compatible. Even the configuration property is changed (log4j.configuration for 1.2 vs. log4jConfigurationFile for 2.x). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...
Github user tsliwowicz commented on the pull request: https://github.com/apache/spark/pull/4998#issuecomment-80878911 @srowen We don't know of an option to run side by side with two log4j versions. It conflicts on both slf4j and log4j classes. In any case, I believe it won't create a bigger problem because it just adds a build option. How is that a problem at all? For us, the log4j version is fixed for all drivers. The rest of the organisation switched to log4j 2, and we would like to do the same for Spark. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/4998#issuecomment-80947530 Yeah that's what I am hoping might work, to remove log4j 1.2 and replace with log4j 2 + 1.2-to-2 bridge, and also use the slfj4-to-log4j2 bridge, and update Spark itself to use log4j 2. The idea is that apps that expect to find log4j 1.2 still find its API. It's still somewhat problematic if user logging config stops working. Is it not backwards-compatible / accepts the old format too? I am also worried this somehow interacts badly with classloaders, but it stands a chance of working. But if it works well I mean that the main build would do this, not a profile. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...
Github user tsliwowicz commented on the pull request: https://github.com/apache/spark/pull/4998#issuecomment-78868836 @srowen this is in essence the same as the ability to control hadoop version only through classpath manipulation. Since there are many flavors, its being done by using different profiles. While Spark may choose to leave the default log4j support for 1.2, the only way to fully switch to log4j 2 is via changing the classpath. Using the same logic that was applied by the spark designer for hadoop, @liorchaga created the profile for log4j 2. Adding a build profile does not complicate things because the default (that is not specifying the profile) means the build stays the same. Only those who need it, will use the profile. Same as for various hadoop versions. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/4998#issuecomment-78893597 I think the difference is that Hadoop profiles build Spark for different, but fixed, deployment contexts. This is trying to accommodate two different types of _user_ app. This build of Spark won't run apps that normal builds will, and vice versa (right?). This is why I don't think a build profile helps people solve this problem (without creating a bigger one). I'd really hope there's a flavor of log4j that supports old and new code and config; that would be much more ideal. Or, another answer is just that you can't use log4j 2 with Spark and need to use slf4j. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4998#issuecomment-78476242 Can one of the admins verify this patch? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/4998#issuecomment-78500551 Does it actually cause a problem? (I wouldn't be surprised, but I'm just asking.) Can you use slf4j, or do you need log4j 2? can you use the log4j 2 - log4j 1.2 adapter? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/4998#discussion_r26318113 --- Diff: pom.xml --- @@ -1706,6 +1706,43 @@ jline.groupidjline/jline.groupid /properties /profile +profile +idlog4j2/id +activation +propertynamelog4j2/name/property +/activation +properties +log4j2.version2.1/log4j2.version +/properties +dependencies --- End diff -- Hm. I wonder what would happen if Spark always used log4j 2.x, and then included the 1.2 - 2 bridge code just in case an app relies on Spark providing 1.2? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/4998#issuecomment-78498110 No, we do not want a new profile for this. What problem are you trying to solve? Spark uses SLF4J, not a particular logging framework. However it activates log4j 1 by default, and uses log4j 1 to control logging levels. This shouldn't change. You should be able to ultimately get log4j 2 to work in your build though with enough exclusion and rewiring. There is not a reason that Spark needs to also use several version of log4j itself. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...
Github user liorchaga commented on the pull request: https://github.com/apache/spark/pull/4998#issuecomment-78502617 The problem is having both log4j-over-slf4j and slf4j-log4j12 in the classpath. To get it working with log4j 1.2 we need to exclude slf4j-log4j12. Also, having both log4j-1.2.x and log4j-core-2.x in the classpath is a problem because both have class org.apache.log4j.Logger. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/4998#issuecomment-78513631 Hm, the bridge I was thinking of is for 1.2 - 2, not the other way around. I was thinking of http://logging.apache.org/log4j/log4j-2.1/log4j-to-slf4j/index.html as well but I imagine that has the class conflict problem. Does `extraClassPath` make it happen to work though? your user code would see log4j 2, but so would Spark but that might happen to work? log4j isn't used much directly by Spark. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...
Github user liorchaga commented on the pull request: https://github.com/apache/spark/pull/4998#issuecomment-78525274 On second thought, correct me if I'm wrong, in such case spark users would have to migrate their log config. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/4998#issuecomment-78534488 @liorchaga I was thinking that Spark would just use log4j 2, and include the log4j 1.2 bridge, so that no matter what, user apps would see the classes they need. I don't know if it means users would have to migrate config -- would log4j 2 + bridge not let them use their current config? Otherwise, this is a pretty tough one. It's another symptom of classpath leakage, although this one is harder to deal with than usual. I'm hesitant to complicate things with another build profile, because I don't think Spark would make releases with log4j 2 flavors just for this anyway. The thing is, your app would work on your specially-built Spark but not others, and perhaps vice versa. It's a sticky one for sure. If anyone has other approaches, maybe post them on the JIRA discussion? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...
Github user liorchaga commented on the pull request: https://github.com/apache/spark/pull/4998#issuecomment-78523952 This would require all existing spark deployments to add the bridge to the classpath (or to rebuild the distribution with a profile that includes this dependency), unless they change their config to 2.x. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...
Github user liorchaga commented on the pull request: https://github.com/apache/spark/pull/4998#issuecomment-78499766 We are using log4j 2.x in our executor code. I tried adding log4j 2.x dependencies using spark.executor.extraClassPath, but then we have both log4j 1.2 and 2.x in our classpath (1.2 is shaded in spark-assembly). Maybe I'm missing something here, but as I understand I must exclude the log4j 1.2 dependencies to get it working properly. We don't mind having a profile for just excluding the log4j 1.2 dependencies, and than adding the 2.x jars with extraClassPath, if that is more acceptable. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...
Github user liorchaga commented on the pull request: https://github.com/apache/spark/pull/4998#issuecomment-78515735 Tried it with extraClassPath. I provided through SPARK_EXECUTOR_OPTS the log4j.configurationFile property for Log4j 2.x. In the log config, I declared two file appenders - one for all loggers with debug level, and another one to our executor code (also with debug level). Both log files were created, but only the first (for all classes) had data in it, and it contained logging messages for spark code, not for my code. The second file was empty. Once I excluded log4j 1.2 dependencies, I got messages for the executor code as well. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6305] add log4j2 profile and prevent lo...
GitHub user liorchaga opened a pull request: https://github.com/apache/spark/pull/4998 [SPARK-6305] add log4j2 profile and prevent log4j 1.2 shading * log4j and slf4j-log4j12 compile dependencies in submodules removed * log4j2 profile added with log4h12 and slf4j-log4j12 dependencies declared with provided scope (to avoid transitive dependencies when shading), and log4j2 dependencies declared with compile scope * updated documentation You can merge this pull request into a Git repository by running: $ git pull https://github.com/taboola/spark master-log4j2 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/4998.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4998 commit 41e3e3506e6bf123864606cf7ee4888283ce6bc8 Author: Lior Chaga lio...@taboola.com Date: 2015-03-12T12:05:10Z * log4j and slf4j-log4j12 compile dependencies in submodules removed * log4j2 profile added with log4h12 and slf4j-log4j12 dependencies declared with provided scope (to avoid transitive dependencies when shading), and log4j2 dependencies declared with compile scope * updated documentation --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org