skogler commented on pull request #993:
URL: https://github.com/apache/systemds/pull/993#issuecomment-660726427
Okay, it's a problem with the combination of Maven Surefire and JUnit
parameterized tests.
Running the test from IDEA works fine. Removing the parametrization also
makes
skogler commented on pull request #993:
URL: https://github.com/apache/systemds/pull/993#issuecomment-660731660
Yeah, the only way I can find to get the tests to run reliably is to set the
maven surefire plugin option `parallel` to `none`.
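A minimal sketch of what that setting could look like in `pom.xml`. Only the `parallel` option and its value `none` come from the comment above. The plugin coordinates are the standard ones for Maven Surefire, and the placement within the project's actual build section is an assumption. The version element is left out because the thread does not state it.

```xml
<!-- Sketch only: where this block sits in the project's pom.xml is assumed -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <!-- Turn off in-JVM parallel test execution, as suggested in the comment -->
    <parallel>none</parallel>
  </configuration>
</plugin>
```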
j143 commented on pull request #997:
URL: https://github.com/apache/systemds/pull/997#issuecomment-660851656
This can be of interest to @mboehm7 @phaniarnab .
Open for feedback from all the devs. :smile:
This is an
Baunsgaard commented on a change in pull request #993:
URL: https://github.com/apache/systemds/pull/993#discussion_r457212235
##
File path: scripts/staging/entity-resolution/README.md
##
@@ -0,0 +1,99 @@
+# Entity Resolution
+
+## Pipeline design and primitives
+
+We provide
j143 opened a new pull request #997:
URL: https://github.com/apache/systemds/pull/997
* Takes advantage of existing R algorithm scripts used for
codegen testing.
* This would improve the testing by allowing us to provide all
the necessary inputs into the script.
skogler commented on a change in pull request #993:
URL: https://github.com/apache/systemds/pull/993#discussion_r457618489
##
File path: pom.xml
##
@@ -257,12 +257,6 @@
3.0.0-M4
skogler commented on a change in pull request #993:
URL: https://github.com/apache/systemds/pull/993#discussion_r457302869
##
File path:
src/test/java/org/apache/sysds/test/applications/EntityResolutionBinaryTest.java
##
@@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache
skogler commented on a change in pull request #993:
URL: https://github.com/apache/systemds/pull/993#discussion_r457312560
##
File path: scripts/staging/entity-resolution/README.md
##
@@ -0,0 +1,99 @@
+# Entity Resolution
+
+## Pipeline design and primitives
+
+We provide two
phaniarnab commented on pull request #997:
URL: https://github.com/apache/systemds/pull/997#issuecomment-661749819
This is good. LGTM.
This is an automated message from the Apache Git Service.
To respond to the message,
phaniarnab commented on pull request #992:
URL: https://github.com/apache/systemds/pull/992#issuecomment-661754419
I didn't go through all the comments in the cited PRs, but I'm curious to
see how this Spark/Hadoop upgrade impacts the performance. It might not improve
anything, but also
Baunsgaard commented on a change in pull request #993:
URL: https://github.com/apache/systemds/pull/993#discussion_r457960775
##
File path: scripts/staging/entity-resolution/entity-clustering.dml
##
@@ -0,0 +1,119 @@
Baunsgaard commented on pull request #993:
URL: https://github.com/apache/systemds/pull/993#issuecomment-661746853
LGTM
kev-inn commented on a change in pull request #984:
URL: https://github.com/apache/systemds/pull/984#discussion_r458131888
##
File path:
src/test/java/org/apache/sysds/test/functions/misc/DataTypeCastingTest.java
##
@@ -85,10 +85,10 @@ public void testMatrixToMatrix()
j143 commented on pull request #997:
URL: https://github.com/apache/systemds/pull/997#issuecomment-661855002
Thank you.
Shafaq-Siddiqi commented on pull request #988:
URL: https://github.com/apache/systemds/pull/988#issuecomment-663111606
> I know this is in progress, but still commenting: is it possible to
replace the `for` blocks with `parfor`?
I tried but there are some matrix dependencies
mboehm7 commented on pull request #996:
URL: https://github.com/apache/systemds/pull/996#issuecomment-663176078
Just for closure - now all raised bugs have been fixed in master. @OsChri
this was a great catch.
kev-inn commented on a change in pull request #984:
URL: https://github.com/apache/systemds/pull/984#discussion_r458937066
##
File path:
src/main/java/org/apache/sysds/hops/rewrite/RewriteConstantFolding.java
##
@@ -98,13 +98,7 @@ private Hop rConstantFoldingExpression( Hop
kev-inn commented on a change in pull request #984:
URL: https://github.com/apache/systemds/pull/984#discussion_r458943856
##
File path:
src/test/java/org/apache/sysds/test/functions/io/csv/ReadCSVTest1.java
##
@@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software
kev-inn commented on pull request #980:
URL: https://github.com/apache/systemds/pull/980#issuecomment-662562801
Thanks for the explanation, sounds good.
Leaving configurations up to the user is better in my opinion and I also
agree with your sentiment on the template files.
LGTM :+1:
kev-inn commented on a change in pull request #984:
URL: https://github.com/apache/systemds/pull/984#discussion_r458946615
##
File path: src/test/java/org/apache/sysds/test/AutomatedTestBase.java
##
@@ -978,19 +989,26 @@ protected void runRScript(boolean newWay) {
j143 opened a new pull request #1000:
URL: https://github.com/apache/systemds/pull/1000
* Runs a cluster with the SystemDS library loaded, using MLContext.
* This notebook uses Scala.
j143 opened a new pull request #999:
URL: https://github.com/apache/systemds/pull/999
* Creates a workspace with all the dependencies for project build.
* Helps prototype the DML code in browser.
j143 commented on pull request #1000:
URL: https://github.com/apache/systemds/pull/1000#issuecomment-662627140
Protip (setting up a Databricks cluster):
**Step 1:**
![image](https://user-images.githubusercontent.com/53068787/88215390-3295ab00-cc79-11ea-8fe2-f6c748db649f.png)
j143 opened a new pull request #1001:
URL: https://github.com/apache/systemds/pull/1001
- Contains some changes needed to make the code work.
mandadipavan opened a new pull request #995:
URL: https://github.com/apache/systemds/pull/995
In the Univariate Statistics example, the first command refers to the
missing file runStandaloneSystemDS.sh. It should be changed to systemds.
mboehm7 commented on pull request #991:
URL: https://github.com/apache/systemds/pull/991#issuecomment-661178461
LGTM - thanks @Baunsgaard
asfgit closed pull request #991:
URL: https://github.com/apache/systemds/pull/991
mandadipavan closed pull request #995:
URL: https://github.com/apache/systemds/pull/995
mboehm7 commented on pull request #996:
URL: https://github.com/apache/systemds/pull/996#issuecomment-661209935
Also thanks for catching the `eval` issues - I'll fix them in subsequent
commits.
mboehm7 commented on pull request #996:
URL: https://github.com/apache/systemds/pull/996#issuecomment-661208460
LGTM - thanks for this great new builtin @OsChri. I just slightly changed
the test to use fixed seeds and replaced the for loops for joint sorting of two
matrices with
skogler commented on a change in pull request #993:
URL: https://github.com/apache/systemds/pull/993#discussion_r457582462
##
File path:
src/test/java/org/apache/sysds/test/applications/EntityResolutionBinaryTest.java
##
@@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache
mboehm7 commented on pull request #984:
URL: https://github.com/apache/systemds/pull/984#issuecomment-663911394
Thanks for the initiative @Baunsgaard. Some of these changes are very good,
on others I'm kind of split. I would recommend we merge it in (with the changes
I made) and see how
mboehm7 edited a comment on pull request #984:
URL: https://github.com/apache/systemds/pull/984#issuecomment-663911394
Thanks for the initiative @Baunsgaard. Some of these changes are very good,
on others I'm kind of split. I would recommend we merge it in (with the changes
I made) and
asfgit closed pull request #984:
URL: https://github.com/apache/systemds/pull/984
Baunsgaard opened a new pull request #1021:
URL: https://github.com/apache/systemds/pull/1021
This PR removes local in-file overrides of the logging level wherever they are found.
Shafaq-Siddiqi commented on pull request #1146:
URL: https://github.com/apache/systemds/pull/1146#issuecomment-758283020
Hello @AlexanderErtl,
Thank you for your contribution. If you are still working on the Spark
functionality then could you please mark your PR as "WIP" to save it from
Baunsgaard opened a new pull request #1151:
URL: https://github.com/apache/systemds/pull/1151
This commit changes the overlapping matrix to drastically reduce
decompression time in cases of right-hand-side sparse matrix multiplication.
Other than this, many of the methods are cleaned
juliale-15 opened a new pull request #1135:
URL: https://github.com/apache/systemds/pull/1135
@A-Postl and I created this first version of a design document for the
python script generator. We would appreciate feedback on whether our planned
approach could work.
gPathpp opened a new pull request #1145:
URL: https://github.com/apache/systemds/pull/1145
Work in progress.
AlexanderErtl opened a new pull request #1146:
URL: https://github.com/apache/systemds/pull/1146
CorrectTypos builtin script for ExecType.CP
(ExecType.SPARK not currently functional)
Baunsgaard commented on a change in pull request #1112:
URL: https://github.com/apache/systemds/pull/1112#discussion_r531578960
##
File path:
src/main/java/org/apache/sysds/runtime/instructions/fed/ReorgFEDInstruction.java
##
@@ -50,20 +77,196 @@ public static
Shafaq-Siddiqi commented on a change in pull request #1117:
URL: https://github.com/apache/systemds/pull/1117#discussion_r531556439
##
File path: scripts/builtin/statsNA.dml
##
@@ -0,0 +1,212 @@
+#-
+#
+# Licensed to
Shafaq-Siddiqi commented on a change in pull request #1117:
URL: https://github.com/apache/systemds/pull/1117#discussion_r531554010
##
File path: scripts/builtin/statsNA.dml
##
@@ -0,0 +1,212 @@
+#-
+#
+# Licensed to
Shafaq-Siddiqi commented on a change in pull request #1117:
URL: https://github.com/apache/systemds/pull/1117#discussion_r531568712
##
File path: scripts/builtin/statsNA.dml
##
@@ -0,0 +1,212 @@
+#-
+#
+# Licensed to
Shafaq-Siddiqi commented on a change in pull request #1117:
URL: https://github.com/apache/systemds/pull/1117#discussion_r531558217
##
File path: scripts/builtin/statsNA.dml
##
@@ -0,0 +1,212 @@
+#-
+#
+# Licensed to
Shafaq-Siddiqi commented on a change in pull request #1117:
URL: https://github.com/apache/systemds/pull/1117#discussion_r531566877
##
File path: scripts/builtin/statsNA.dml
##
@@ -0,0 +1,212 @@
+#-
+#
+# Licensed to
haubitzer opened a new pull request #1117:
URL: https://github.com/apache/systemds/pull/1117
**Work in progress**
* no test implemented yet
* open question marked with "TODO"
Baunsgaard merged pull request #1114:
URL: https://github.com/apache/systemds/pull/1114
tobiasrieger commented on pull request #1113:
URL: https://github.com/apache/systemds/pull/1113#issuecomment-732810912
I'm talking about the read that originally reads the input. I've already
asked Sebastian B. to take a look, as I don't think the issue is with the
parameter server.
haubitzer opened a new pull request #1121:
URL: https://github.com/apache/systemds/pull/1121
haubitzer closed pull request #1121:
URL: https://github.com/apache/systemds/pull/1121
Baunsgaard merged pull request #1116:
URL: https://github.com/apache/systemds/pull/1116
Baunsgaard opened a new pull request #1116:
URL: https://github.com/apache/systemds/pull/1116
This commit adds the functionality to do unary aggregates on the compressed
overlapping matrices.
Baunsgaard merged pull request #1107:
URL: https://github.com/apache/systemds/pull/1107
sebwrede commented on pull request #1113:
URL: https://github.com/apache/systemds/pull/1113#issuecomment-732130997
**Generally, I think this PR looks good.**
In AutomatedTestBase:640, you create a new PrivacyConstraint. This means
that it writes the privacy level to the metadata file.
Baunsgaard commented on pull request #1114:
URL: https://github.com/apache/systemds/pull/1114#issuecomment-732284324
The earlier measurements were wrong; the actual results are:
```code
cla , 3.1200
lcla, 0.4000
mkl , 62.3300
cla ,
Baunsgaard opened a new pull request #1114:
URL: https://github.com/apache/systemds/pull/1114
This commit adds support for relational operations
(<, >, <=, etc.) in the compressed space, for overlapping matrices.
If the relational expression returns a constant matrix, the
Shafaq-Siddiqi opened a new pull request #1115:
URL: https://github.com/apache/systemds/pull/1115
Pipelines Optimizer and various minor built-ins
This commit contains,
1. Optimizer for cleaning pipelines
2. Minor built-ins imputeByMean, imputeByMedian, frameSort,
Baunsgaard opened a new pull request #1118:
URL: https://github.com/apache/systemds/pull/1118
This PR re-enables parallel left multiplication for sparse
matrices and adds row-based parallelization for dense ones.
Furthermore, it also contains optimization of Binary and scalar divide,
sebwrede opened a new pull request #1120:
URL: https://github.com/apache/systemds/pull/1120
Refactor of the handling of fine-grained privacy constraints in
PrivacyMonitor. This also removes some code not needed anymore.
Shafaq-Siddiqi closed pull request #1115:
URL: https://github.com/apache/systemds/pull/1115
Shafaq-Siddiqi closed pull request #1055:
URL: https://github.com/apache/systemds/pull/1055
mboehm7 commented on pull request #1113:
URL: https://github.com/apache/systemds/pull/1113#issuecomment-735300632
ok, this invalid data consolidation issue is now fixed in master - please
rebase.
Vulturemox opened a new pull request #1119:
URL: https://github.com/apache/systemds/pull/1119
**Work in Progress**
This is our first design document for the project of implementing DataWig in
SystemDS. We would appreciate feedback on whether this approach is actually
reasonable.
Shafaq-Siddiqi commented on a change in pull request #1119:
URL: https://github.com/apache/systemds/pull/1119#discussion_r534075365
##
File path: scripts/staging/datawig/DesignDocument.md
##
@@ -0,0 +1,52 @@
+# DataWig Design Document
+Julian Rakuschek, Noah Ruhmer
+### Basic
Shafaq-Siddiqi commented on a change in pull request #1119:
URL: https://github.com/apache/systemds/pull/1119#discussion_r533299065
##
File path: scripts/staging/datawig/DesignDocument.md
##
@@ -0,0 +1,52 @@
+# DataWig Design Document
+Julian Rakuschek, Noah Ruhmer
+### Basic
Vulturemox commented on a change in pull request #1119:
URL: https://github.com/apache/systemds/pull/1119#discussion_r533320131
##
File path: scripts/staging/datawig/DesignDocument.md
##
@@ -0,0 +1,52 @@
+# DataWig Design Document
+Julian Rakuschek, Noah Ruhmer
+### Basic Idea
Shafaq-Siddiqi commented on a change in pull request #1119:
URL: https://github.com/apache/systemds/pull/1119#discussion_r533325224
##
File path: scripts/staging/datawig/DesignDocument.md
##
@@ -0,0 +1,52 @@
+# DataWig Design Document
+Julian Rakuschek, Noah Ruhmer
+### Basic
Vulturemox commented on a change in pull request #1119:
URL: https://github.com/apache/systemds/pull/1119#discussion_r533365636
##
File path: scripts/staging/datawig/DesignDocument.md
##
@@ -0,0 +1,52 @@
+# DataWig Design Document
+Julian Rakuschek, Noah Ruhmer
+### Basic Idea
Shafaq-Siddiqi commented on a change in pull request #1119:
URL: https://github.com/apache/systemds/pull/1119#discussion_r533287586
##
File path: scripts/staging/datawig/DesignDocument.md
##
@@ -0,0 +1,52 @@
+# DataWig Design Document
+Julian Rakuschek, Noah Ruhmer
+### Basic
Vulturemox commented on a change in pull request #1119:
URL: https://github.com/apache/systemds/pull/1119#discussion_r533320677
##
File path: scripts/staging/datawig/DesignDocument.md
##
@@ -0,0 +1,52 @@
+# DataWig Design Document
+Julian Rakuschek, Noah Ruhmer
+### Basic Idea
Baunsgaard closed pull request #1118:
URL: https://github.com/apache/systemds/pull/1118
Shafaq-Siddiqi commented on pull request #1125:
URL: https://github.com/apache/systemds/pull/1125#issuecomment-747368122
Hi,
I appreciate the design draft, it is a good effort. I would suggest doing
the mapping of Scikit-learn algorithms to DML and vice versa. Keep it simple
you only
Baunsgaard commented on pull request #1127:
URL: https://github.com/apache/systemds/pull/1127#issuecomment-747426098
> I'll have a look tonight and see what we can do. Airline was dense, right?
Yes, airline is dense, and I don't seem to be able to reproduce the bad
performance
mboehm7 commented on pull request #1127:
URL: https://github.com/apache/systemds/pull/1127#issuecomment-747424448
I'll have a look tonight and see what we can do. Airline was dense, right?
mboehm7 commented on pull request #1126:
URL: https://github.com/apache/systemds/pull/1126#issuecomment-747405712
LGTM - thanks for the test @sebwrede. I now added explicit error handling
for inconsistent federated data characteristics, and fixed the test accordingly
(the underlying
Baunsgaard commented on pull request #1127:
URL: https://github.com/apache/systemds/pull/1127#issuecomment-747424955
The large 15 mil case seems to have little to no difference.
But there still is a bug somewhere.
XPS:
```bash
scripts/perftest/results/transpose-large.log
Baunsgaard edited a comment on pull request #1127:
URL: https://github.com/apache/systemds/pull/1127#issuecomment-747426098
> I'll have a look tonight and see what we can do. Airline was dense, right?
Yes, airline is dense, and I don't seem to be able to reproduce the bad
performance
asfgit closed pull request #1126:
URL: https://github.com/apache/systemds/pull/1126
Baunsgaard opened a new pull request #1127:
URL: https://github.com/apache/systemds/pull/1127
This PR contains a simple addition to the micro benchmarks.
This time transpose of a matrix is measured.
3 basic cases:
"skinny" with 2.5mil rows 50 cols
"wide" with 50 cols and
mboehm7 commented on pull request #1125:
URL: https://github.com/apache/systemds/pull/1125#issuecomment-747535307
In general, that's a good starting point. We had another use case of
importing sk-learn pipelines in mind, but adding the sklearn-onnx-dml model
converter is also an
mboehm7 commented on pull request #1127:
URL: https://github.com/apache/systemds/pull/1127#issuecomment-747736880
ok, I just pushed some minor performance improvements for sparse-sparse
transpose operations which reduced the execution time of ten 2.5M x 50
(sparsity=0.1, seed=12)
Baunsgaard commented on pull request #1127:
URL: https://github.com/apache/systemds/pull/1127#issuecomment-748005848
When looking at before and after (I tested it by dropping the
transpose commit from the history), it looks like I might have done something
wrong in the initial
mboehm7 commented on pull request #1123:
URL: https://github.com/apache/systemds/pull/1123#issuecomment-743759054
Thanks @Baunsgaard for eliminating the unnecessary colMeans in case of
center and scale. However, please refrain from unnecessary changes of APIs and
external behavior. I'll
Baunsgaard opened a new pull request #1124:
URL: https://github.com/apache/systemds/pull/1124
Adds a predict function for PCA and an inverse function.
The predict function is for unseen data that the PCA was not trained on, just like
our predict functions for other algorithms.
The
Baunsgaard commented on pull request #1123:
URL: https://github.com/apache/systemds/pull/1123#issuecomment-743778523
> but you **wanted** to do this
Oops, logic fine ... execution wrong. Great catch, thanks!
asfgit closed pull request #1123:
URL: https://github.com/apache/systemds/pull/1123
mboehm7 commented on pull request #1123:
URL: https://github.com/apache/systemds/pull/1123#issuecomment-743766561
ad 1) besides the changed overall behavior, the comment referred to
`replace(target=ScaleFactor, pattern=NaN, replacement=1e-16);`, which would
need to replace zero as this
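One plausible reading of the point above, sketched in Java rather than DML: for a constant column the scale factor is 0, not NaN, so replacing NaN in the scale factor changes nothing and the later division still yields NaN. The class and method names here are hypothetical, the population standard deviation is an assumption, and `1e-16` is simply the replacement value quoted in the comment. This is not the SystemDS `scale` implementation.

```java
import java.util.Arrays;

public class ScaleFactorSketch {
    // Population standard deviation of one column (hypothetical helper).
    static double std(double[] col) {
        double mean = Arrays.stream(col).average().orElse(0.0);
        double ss = 0.0;
        for (double v : col) ss += (v - mean) * (v - mean);
        return Math.sqrt(ss / col.length);
    }

    // Center and scale one column; optionally replace a zero scale factor first.
    static double[] scale(double[] col, boolean replaceZero) {
        double mean = Arrays.stream(col).average().orElse(0.0);
        double sf = std(col);
        // Replacing NaN is a no-op for a constant column: sf is 0.0, not NaN.
        if (Double.isNaN(sf)) sf = 1e-16;
        // Replacing zero is what actually avoids the 0/0 division below.
        if (replaceZero && sf == 0.0) sf = 1e-16;
        double[] out = new double[col.length];
        for (int i = 0; i < col.length; i++) out[i] = (col[i] - mean) / sf;
        return out;
    }

    public static void main(String[] args) {
        double[] constant = {3.0, 3.0, 3.0};
        System.out.println(Arrays.toString(scale(constant, false))); // [NaN, NaN, NaN]
        System.out.println(Arrays.toString(scale(constant, true)));  // [0.0, 0.0, 0.0]
    }
}
```

With `replaceZero` set to `false` only the NaN replacement runs, which leaves the zero scale factor in place. With it set to `true` the constant column scales to zeros.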
Baunsgaard commented on a change in pull request #1124:
URL: https://github.com/apache/systemds/pull/1124#discussion_r541622750
##
File path: scripts/builtin/scale.dml
##
@@ -19,29 +19,48 @@
#
#-
-# Scale and
Baunsgaard commented on pull request #1123:
URL: https://github.com/apache/systemds/pull/1123#issuecomment-743764247
1. I changed the NaN replacement because the NaN would be introduced in cases of
division by zero. Therefore, it made sense to change the replacement on the
scale factor. This
mboehm7 commented on pull request #1123:
URL: https://github.com/apache/systemds/pull/1123#issuecomment-743777089
ad 1) there is a mismatch between what you wanted to do and what your code
actually did, the comment just pointed that out. The PR did this
`replace(target=ScaleFactor,
Baunsgaard commented on pull request #1123:
URL: https://github.com/apache/systemds/pull/1123#issuecomment-743772534
> ad 1) besides the changed overall behavior, the comment referred to
`replace(target=ScaleFactor, pattern=NaN, replacement=1e-16);`, which would
need to replace zero as
Baunsgaard opened a new pull request #1122:
URL: https://github.com/apache/systemds/pull/1122
This commit contains various changes
1. Compressed Sparse matrix multiplication
2. modified matrix multiplication to push down information of
transposing to the ba+* op. to allow
ywcb00 opened a new pull request #1133:
URL: https://github.com/apache/systemds/pull/1133
This is a PR for adding WCeMM as a first federated quaternary operation.
The PR contains the implementations for parsing and processing the
instruction, as well as tests for the instruction.
phaniarnab opened a new pull request #1134:
URL: https://github.com/apache/systemds/pull/1134
PR to run tests.
tobiasrieger closed pull request #1113:
URL: https://github.com/apache/systemds/pull/1113
tobiasrieger opened a new pull request #1131:
URL: https://github.com/apache/systemds/pull/1131
This PR includes the closed PR #1113 and all changes proposed in its
comments. It was rebased on master and consolidated to make it easier to merge
Changes list:
- Added four new