uzadude opened a new pull request #5: [DATAFU-148] Spark Support
URL: https://github.com/apache/datafu/pull/5
Creating this PR to consolidate all the related changes for the first version
that supports Spark.
[DATAFU-148](https://issues.apache.org/jira/browse/DATAFU-148)
uzadude commented on issue #5: [DATAFU-148] Spark Support
URL: https://github.com/apache/datafu/pull/5#issuecomment-464613149
Re-opening after adding ScalaPythonBridge functionality
This is an automated message from the Apach
rjurney opened a new pull request #15: Add Spark functionality to DataFu,
datafu-spark
URL: https://github.com/apache/datafu/pull/15
Need the pull request to review/discuss
eyala commented on issue #5: [DATAFU-148] Spark Support
URL: https://github.com/apache/datafu/pull/5#issuecomment-485240762
Merged the commit from Feb 18th
eyala commented on issue #15: Add Spark functionality to DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#issuecomment-488644759
I didn't take care of all of Russell's comments, but I had to make some
changes in the tests in order for them to pass in all the Spark/Scala ve
uzadude closed pull request #5: [DATAFU-148] Spark Support
URL: https://github.com/apache/datafu/pull/5
uzadude commented on issue #15: Add Spark functionality to DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#issuecomment-492920942
@rjurney, I've reformatted all the code with Spark's scalastyle.xml. I
believe this should solve most of the code style issues.
matthayes commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r286187133
##
File path: datafu-spark/README.md
##
@@ -0,0 +1,71 @@
+# datafu-spark
+
+datafu-spark contains a numb
matthayes commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r286242193
##
File path:
datafu-spark/src/main/scala/spark/utils/overwrites/SparkPythonRunner.scala
##
@@ -0,0 +1,
matthayes commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r286238319
##
File path: datafu-spark/src/test/resources/python_tests/pyfromscala.py
##
@@ -0,0 +1,92 @@
+# License
matthayes commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r286237816
##
File path:
datafu-spark/src/test/resources/python_tests/pyfromscala_with_error.py
##
@@ -0,0 +1,18 @
matthayes commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r286243461
##
File path: datafu-spark/build.gradle
##
@@ -0,0 +1,91 @@
+/*
+ * Licensed to the Apache Software Foun
matthayes commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r286236660
##
File path:
datafu-spark/src/test/resources/META-INF/services/datafu.spark.PythonResource
##
@@ -0,0
matthayes commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r286192534
##
File path: datafu-spark/build_and_test_spark.sh
##
@@ -0,0 +1,115 @@
+# Licensed to the Apache Softwa
matthayes commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r286240661
##
File path: datafu-spark/src/main/scala/datafu/spark/ScalaPythonBridge.scala
##
@@ -0,0 +1,166 @@
+/*
matthayes commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r286187579
##
File path: datafu-spark/README.md
##
@@ -0,0 +1,71 @@
+# datafu-spark
+
+datafu-spark contains a numb
matthayes commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r286237649
##
File path: datafu-spark/src/test/resources/python_tests/df_utils_tests.py
##
@@ -0,0 +1,88 @@
+# Lice
matthayes commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r286197035
##
File path: datafu-spark/src/main/resources/META-INF/LICENSE
##
@@ -0,0 +1,393 @@
+
matthayes commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r286197225
##
File path: datafu-spark/src/main/resources/META-INF/NOTICE
##
@@ -0,0 +1,60 @@
+Apache DataFu
Revi
matthayes commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r286243214
##
File path: datafu-spark/README.md
##
@@ -0,0 +1,71 @@
+# datafu-spark
+
+datafu-spark contains a numb
matthayes commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r286188718
##
File path: datafu-spark/build.gradle
##
@@ -0,0 +1,91 @@
+/*
+ * Licensed to the Apache Software Foun
matthayes commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r286244813
##
File path: datafu-spark/src/main/scala/datafu/spark/DataFrameOps.scala
##
@@ -0,0 +1,92 @@
+/*
+ * Li
matthayes commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r286233883
##
File path: datafu-spark/src/main/resources/pyspark_utils/df_utils.py
##
@@ -0,0 +1,161 @@
+# Licensed
matthayes commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r286234807
##
File path: datafu-spark/src/main/resources/pyspark_utils/init_spark_context.py
##
@@ -0,0 +1,21 @@
+#
rjurney commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r287922880
##
File path: datafu-spark/build_and_test_spark.sh
##
@@ -0,0 +1,115 @@
+# Licensed to the Apache Software
rjurney commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r287923002
##
File path: datafu-spark/README.md
##
@@ -0,0 +1,71 @@
+# datafu-spark
+
+datafu-spark contains a number
rjurney commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r287923515
##
File path: datafu-spark/src/main/scala/datafu/spark/DataFrameOps.scala
##
@@ -0,0 +1,92 @@
+/*
+ * Lice
eyala commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r288963450
##
File path: datafu-spark/README.md
##
@@ -0,0 +1,71 @@
+# datafu-spark
+
+datafu-spark contains a number o
eyala commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r288965080
##
File path:
datafu-spark/src/test/resources/META-INF/services/datafu.spark.PythonResource
##
@@ -0,0 +1,2
eyala commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r289000541
##
File path: datafu-spark/build.gradle
##
@@ -0,0 +1,91 @@
+/*
+ * Licensed to the Apache Software Foundati
eyala commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r289000407
##
File path: datafu-spark/README.md
##
@@ -0,0 +1,71 @@
+# datafu-spark
+
+datafu-spark contains a number o
eyala commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r289000478
##
File path: datafu-spark/README.md
##
@@ -0,0 +1,71 @@
+# datafu-spark
+
+datafu-spark contains a number o
eyala commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r289003198
##
File path: datafu-spark/build_and_test_spark.sh
##
@@ -0,0 +1,115 @@
+# Licensed to the Apache Software F
eyala commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r289003243
##
File path: datafu-spark/src/main/resources/META-INF/LICENSE
##
@@ -0,0 +1,393 @@
+
eyala commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r289003317
##
File path: datafu-spark/src/main/resources/META-INF/NOTICE
##
@@ -0,0 +1,60 @@
+Apache DataFu
Review c
eyala commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r289003417
##
File path:
datafu-spark/src/test/resources/python_tests/pyfromscala_with_error.py
##
@@ -0,0 +1,18 @@
+#
eyala commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r289003367
##
File path: datafu-spark/src/test/resources/python_tests/df_utils_tests.py
##
@@ -0,0 +1,88 @@
+# Licensed
eyala commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r289003455
##
File path: datafu-spark/src/test/resources/python_tests/pyfromscala.py
##
@@ -0,0 +1,92 @@
+# Licensed to
uzadude commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r289847779
##
File path: datafu-spark/src/main/resources/pyspark_utils/df_utils.py
##
@@ -0,0 +1,161 @@
+# Licensed t
uzadude commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r289848210
##
File path: datafu-spark/src/main/resources/pyspark_utils/init_spark_context.py
##
@@ -0,0 +1,21 @@
+# L
uzadude commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r289851400
##
File path: datafu-spark/src/main/scala/datafu/spark/ScalaPythonBridge.scala
##
@@ -0,0 +1,166 @@
+/*
+
uzadude commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r289857613
##
File path:
datafu-spark/src/main/scala/spark/utils/overwrites/SparkPythonRunner.scala
##
@@ -0,0 +1,13
uzadude commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r289864950
##
File path: datafu-spark/src/main/scala/datafu/spark/DataFrameOps.scala
##
@@ -0,0 +1,92 @@
+/*
+ * Lice
eyala commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r290209031
##
File path: datafu-spark/build.gradle
##
@@ -0,0 +1,91 @@
+/*
+ * Licensed to the Apache Software Foundati
eyala commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r292371000
##
File path:
datafu-spark/src/test/resources/META-INF/services/datafu.spark.PythonResource
##
@@ -0,0 +1,2
uzadude commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r293340319
##
File path: datafu-spark/build_and_test_spark.sh
##
@@ -0,0 +1,115 @@
+# Licensed to the Apache Software
matthayes commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r301844121
##
File path: datafu-spark/src/main/resources/pyspark_utils/init_spark_context.py
##
@@ -0,0 +1,21 @@
+#
matthayes commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r301844723
##
File path:
datafu-spark/src/test/resources/META-INF/services/datafu.spark.PythonResource
##
@@ -0,0
matthayes commented on issue #15: Add Spark functionality to DataFu,
datafu-spark
URL: https://github.com/apache/datafu/pull/15#issuecomment-509865726
+1
I reviewed the recent code changes and these look good to me. I am able to
build the JAR via `assemble`.
However I did ru
matthayes commented on issue #15: Add Spark functionality to DataFu,
datafu-spark
URL: https://github.com/apache/datafu/pull/15#issuecomment-509866615
Okay I think I get what's going on. SparkPythonRunner must be assuming that
Python 2.x is being used, however I am using Python 3.6. When
matthayes commented on issue #15: Add Spark functionality to DataFu,
datafu-spark
URL: https://github.com/apache/datafu/pull/15#issuecomment-509866739
Anyways from my perspective I think we're good to merge in. @rjurney any
other comments?
eyala commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r302037973
##
File path:
datafu-spark/src/test/resources/META-INF/services/datafu.spark.PythonResource
##
@@ -0,0 +1,2
rjurney commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r303248340
##
File path: datafu-spark/src/main/resources/pyspark_utils/bridge_utils.py
##
@@ -0,0 +1,72 @@
+# License
rjurney commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r303248625
##
File path: datafu-spark/src/main/resources/pyspark_utils/df_utils.py
##
@@ -0,0 +1,171 @@
+# Licensed t
rjurney commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r303248700
##
File path: datafu-spark/src/main/resources/pyspark_utils/df_utils.py
##
@@ -0,0 +1,171 @@
+# Licensed t
uzadude commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r303910701
##
File path: datafu-spark/src/main/resources/pyspark_utils/bridge_utils.py
##
@@ -0,0 +1,72 @@
+# License
uzadude commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r303911042
##
File path: datafu-spark/src/main/resources/pyspark_utils/df_utils.py
##
@@ -0,0 +1,171 @@
+# Licensed t
uzadude commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r303911500
##
File path: datafu-spark/src/main/resources/pyspark_utils/df_utils.py
##
@@ -0,0 +1,171 @@
+# Licensed t
rjurney commented on a change in pull request #15: Add Spark functionality to
DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#discussion_r304139883
##
File path: datafu-spark/src/main/resources/pyspark_utils/df_utils.py
##
@@ -0,0 +1,171 @@
+# Licensed t
eyala commented on issue #15: Add Spark functionality to DataFu, datafu-spark
URL: https://github.com/apache/datafu/pull/15#issuecomment-512185721
Merged
matthayes commented on issue #15: Add Spark functionality to DataFu,
datafu-spark
URL: https://github.com/apache/datafu/pull/15#issuecomment-512481089
Russell can you close this now since it is merged already?
uzadude opened a new pull request #16: [DATAFU-153] Add support for Python 3
URL: https://github.com/apache/datafu/pull/16
## Summary
With minor code changes, we can easily support Python 3 as well.
matthayes commented on issue #16: [DATAFU-153] Add support for Python 3
URL: https://github.com/apache/datafu/pull/16#issuecomment-606787541
Already merged separately.
matthayes closed pull request #15: Add Spark functionality to DataFu,
datafu-spark
URL: https://github.com/apache/datafu/pull/15
matthayes closed pull request #16: [DATAFU-153] Add support for Python 3
URL: https://github.com/apache/datafu/pull/16
matthayes closed pull request #1: AhoCorasickMatch UDF with unit tests
URL: https://github.com/apache/datafu/pull/1
matthayes commented on issue #2: Added UDF ZipBags which can zip and arbitrary
number of bags into one
URL: https://github.com/apache/datafu/pull/2#issuecomment-606788461
Already merged
matthayes closed pull request #2: Added UDF ZipBags which can zip and arbitrary
number of bags into one
URL: https://github.com/apache/datafu/pull/2
matthayes commented on issue #3: Enhance InUDF to support tuple version and add
java compatibility for datafu-pig
URL: https://github.com/apache/datafu/pull/3#issuecomment-606788743
Closing this as JIRA was filed and resolved as won't fix.
matthayes closed pull request #3: Enhance InUDF to support tuple version and
add java compatibility for datafu-pig
URL: https://github.com/apache/datafu/pull/3
XinyuLiu5566 opened a new pull request #17:
URL: https://github.com/apache/datafu/pull/17
The CodeGuru report detected similar code fragments in the same file at
lines 270:284 and 295:309.
I refactored the code to remove duplicates.
eyala commented on pull request #17:
URL: https://github.com/apache/datafu/pull/17#issuecomment-939903175
Hello! I'm glad to see your contributions. I went over the first commit
before you added more (it looks fine), but I see you've made more changes - we
might want to split them into the
eyala opened a new pull request #18:
URL: https://github.com/apache/datafu/pull/18
1. Fixes codeql build by upgrading the Gradle version used for the Gradle
wrapper
2. Updates libraries used by website
3. Replaces some http urls with https
eyala merged pull request #18:
URL: https://github.com/apache/datafu/pull/18
XinyuLiu5566 commented on pull request #17:
URL: https://github.com/apache/datafu/pull/17#issuecomment-969387640
Hi, sorry for the late reply! How do you want me to split the changes?
eyala commented on pull request #17:
URL: https://github.com/apache/datafu/pull/17#issuecomment-970103773
You can divide it by project - maybe leave the datafu-pig commits here, and
make a new PR with the datafu-hourglass ones.
How did you test this? When I try to build your branch I
XinyuLiu5566 commented on pull request #17:
URL: https://github.com/apache/datafu/pull/17#issuecomment-989480453
After checking the documentation of lombok.NonNull, I think it's better to
leave the code unchanged, so I will open another PR without lombok.NonNull.
eyala commented on pull request #17:
URL: https://github.com/apache/datafu/pull/17#issuecomment-1008300825
You can either open a new PR or just remove the change from this one, either
option is fine.
But are you successfully running all the tests? If you have changes that
aren't exe
dependabot[bot] opened a new pull request #19:
URL: https://github.com/apache/datafu/pull/19
Bumps [nokogiri](https://github.com/sparklemotion/nokogiri) from 1.12.5 to
1.13.2.
Release notes
Sourced from [nokogiri's releases](https://github.com/sparklemotion/nokogiri/releases).
eyala merged pull request #19:
URL: https://github.com/apache/datafu/pull/19
dependabot[bot] opened a new pull request, #20:
URL: https://github.com/apache/datafu/pull/20
Bumps [nokogiri](https://github.com/sparklemotion/nokogiri) from 1.13.2 to
1.13.4.
Release notes
Sourced from [nokogiri's releases](https://github.com/sparklemotion/nokogiri/releases).
eyala merged PR #20:
URL: https://github.com/apache/datafu/pull/20
petrpulc opened a new pull request, #21:
URL: https://github.com/apache/datafu/pull/21
The filter at the end in fact causes the join to behave like 'inner' because
it filters out the records from singleDf that have no matching range... because
range_start and range_end are null in that case
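The effect described above can be reproduced with a minimal sketch in plain Python. This is not DataFu's `joinWithRange` implementation: the data, the `left_join_with_range` helper, and the `range_filter` helper are all hypothetical, and `None` stands in for SQL NULL.

```python
# Hypothetical data for illustration only; not taken from the PR.
singles = [{"id": 1, "value": 5}, {"id": 2, "value": 50}]
ranges = [{"range_start": 0, "range_end": 10}]


def left_join_with_range(singles, ranges):
    """Left join: every single record is kept; unmatched ones get None range fields."""
    out = []
    for s in singles:
        matches = [r for r in ranges
                   if r["range_start"] <= s["value"] <= r["range_end"]]
        if matches:
            out.extend({**s, **r} for r in matches)
        else:
            out.append({**s, "range_start": None, "range_end": None})
    return out


def range_filter(rows):
    """Trailing filter: a comparison against a null bound is never true,
    so rows with no matching range are dropped."""
    return [
        row for row in rows
        if row["range_start"] is not None
        and row["range_end"] is not None
        and row["range_start"] <= row["value"] <= row["range_end"]
    ]


joined = left_join_with_range(singles, ranges)  # keeps both records
filtered = range_filter(joined)                 # only the matched record survives
```

After the filter, the result is the same as an inner join would have produced, which is the behavior the pull request sets out to change.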
eyala commented on PR #21:
URL: https://github.com/apache/datafu/pull/21#issuecomment-1099986584
I think you're correct in your analysis - this does make the join basically
an inner join. There are two issues that need to be addressed before we can
merge this, one theoretical and one practi
uzadude commented on PR #21:
URL: https://github.com/apache/datafu/pull/21#issuecomment-113021
Sure, let's add a `joinType` parameter like in the skew join methods. Let's
keep it backward compatible.
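One way to read the backward-compatibility suggestion is a join-type parameter whose default preserves the current behavior. The sketch below is plain Python, not the DataFu API: the function name `join_with_range`, the `join_type` argument, and the accepted values `"inner"` and `"left"` are assumptions made for illustration.

```python
def join_with_range(singles, ranges, join_type="inner"):
    """Hypothetical range join. The default join_type keeps the existing
    inner-join behavior, so callers that omit it see no change."""
    if join_type not in ("inner", "left"):
        raise ValueError(f"unsupported join_type: {join_type!r}")
    out = []
    for s in singles:
        matches = [r for r in ranges
                   if r["range_start"] <= s["value"] <= r["range_end"]]
        if matches:
            out.extend({**s, **r} for r in matches)
        elif join_type == "left":
            # Keep the unmatched single record, with null range bounds.
            out.append({**s, "range_start": None, "range_end": None})
    return out


# Hypothetical data for illustration only.
singles = [{"id": 1, "value": 5}, {"id": 2, "value": 50}]
ranges = [{"range_start": 0, "range_end": 10}]

default_ids = [row["id"] for row in join_with_range(singles, ranges)]
left_ids = [row["id"] for row in join_with_range(singles, ranges, join_type="left")]
```

Existing callers get the same rows as before, while callers that opt in with `join_type="left"` also receive the unmatched records.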
petrpulc closed pull request #21: Keep the unmatched single records in
joinWithRange
URL: https://github.com/apache/datafu/pull/21
petrpulc commented on PR #21:
URL: https://github.com/apache/datafu/pull/21#issuecomment-52
Hi, I agree with your suggestions, the initial change set was just to spark
(pun intended) the discussion and as my braindump if someone would like to take
the issue faster than I was able to
petrpulc closed pull request #21: Keep the unmatched single records in
joinWithRange
URL: https://github.com/apache/datafu/pull/21
petrpulc commented on PR #21:
URL: https://github.com/apache/datafu/pull/21#issuecomment-101928
Well, during testing I actually found a pretty serious issue... if the
record falls into `decreased_range_single`, but `range_start` and `range_end`
do not contain `single` then I would nee
eyala commented on PR #21:
URL: https://github.com/apache/datafu/pull/21#issuecomment-1127673751
I think you're right. I would say that it's still worth doing ... but if
there are multiple records with the same "key" (the column provided as
_single_) I don't see how the records without a ra
dependabot[bot] opened a new pull request, #22:
URL: https://github.com/apache/datafu/pull/22
Bumps [nokogiri](https://github.com/sparklemotion/nokogiri) from 1.13.4 to
1.13.5.
Release notes
Sourced from [nokogiri's releases](https://github.com/sparklemotion/nokogiri/releases).
eyala merged PR #22:
URL: https://github.com/apache/datafu/pull/22
dependabot[bot] opened a new pull request, #23:
URL: https://github.com/apache/datafu/pull/23
Bumps [nokogiri](https://github.com/sparklemotion/nokogiri) from 1.13.5 to
1.13.6.
Release notes
Sourced from [nokogiri's releases](https://github.com/sparklemotion/nokogiri/releases).
eyala merged PR #23:
URL: https://github.com/apache/datafu/pull/23
eyala commented on PR #21:
URL: https://github.com/apache/datafu/pull/21#issuecomment-1167388998
If you want to submit just your test cases, I've made [a JIRA issue for
generic test improvements](https://issues.apache.org/jira/browse/DATAFU-164).
benraha opened a new pull request, #24:
URL: https://github.com/apache/datafu/pull/24
Added collectLimitedList, a UDAF that works like collect_list but takes a
parameter limiting the number of items collected, chosen randomly.
This is useful when one wants to collect items
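A randomly chosen, size-limited collection can be sketched with reservoir sampling. The source does not say which algorithm the UDAF uses, so reservoir sampling is an assumption here, as are the Python function name `collect_limited_list` and its `rng` parameter.

```python
import random


def collect_limited_list(items, limit, rng=None):
    """Return at most `limit` items from `items`, chosen uniformly at random
    in a single pass (reservoir sampling). Illustrative only; not the UDAF."""
    if limit < 0:
        raise ValueError("limit must be non-negative")
    if rng is None:
        rng = random.Random()
    reservoir = []
    for i, item in enumerate(items):
        if i < limit:
            # Fill the reservoir with the first `limit` items.
            reservoir.append(item)
        else:
            # Replace a random slot with probability limit / (i + 1).
            j = rng.randint(0, i)  # both bounds inclusive
            if j < limit:
                reservoir[j] = item
    return reservoir


sample = collect_limited_list(range(1000), 10, random.Random(0))
short_input = collect_limited_list([1, 2, 3], 10)
```

Memory stays bounded by `limit` regardless of input size, which is the property that matters when a group is too large to collect in full.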
uzadude opened a new pull request, #25:
URL: https://github.com/apache/datafu/pull/25
# Summary
- Adding a register UDFs annotation to conveniently register handy UDFs.
- Adding a few simple handy UDFs
uzadude commented on PR #25:
URL: https://github.com/apache/datafu/pull/25#issuecomment-1181744112
@eyala please check this out.
eyala opened a new pull request, #26:
URL: https://github.com/apache/datafu/pull/26
Belatedly make sure our testing script uses the latest Spark versions.
Hopefully this will also activate the test-on-pr action.