phaniarnab commented on code in PR #1828:
URL: https://github.com/apache/systemds/pull/1828#discussion_r1205726366


##########
src/main/java/org/apache/sysds/runtime/transform/encode/MultiColumnEncoder.java:
##########
@@ -314,11 +314,12 @@ public MatrixBlock apply(CacheBlock<?> in) {
        public MatrixBlock apply(CacheBlock<?> in, int k) {
                // domain sizes are not updated if called from transformapply
                boolean hasUDF = _columnEncoders.stream().anyMatch(e -> e.hasEncoder(ColumnEncoderUDF.class));
+               boolean hasWE = _columnEncoders.stream().anyMatch(e -> e.hasEncoder(ColumnEncoderWordEmbedding.class));
                for(ColumnEncoderComposite columnEncoder : _columnEncoders)
                        columnEncoder.updateAllDCEncoders();
                int numCols = getNumOutCols();
                long estNNz = (long) in.getNumRows() * (hasUDF ? numCols : (long) in.getNumColumns());
-               boolean sparse = MatrixBlock.evalSparseFormatInMemory(in.getNumRows(), numCols, estNNz) && !hasUDF;
+               boolean sparse = MatrixBlock.evalSparseFormatInMemory(in.getNumRows(), numCols, estNNz) && !hasUDF && !hasWE;

Review Comment:
   Yes. For UDF encoders we force dense output because we cannot derive the
   sparsity of the outputs. For embeddings, however, you already know the
   number of nonzeros.
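
   To illustrate the point, here is a minimal, self-contained sketch of how a
   known nonzero count could drive the format decision instead of a hard-coded
   flag. Everything in it is hypothetical: the class `SparsityDecisionSketch`,
   the `EncoderKind` enum, the 0.4 threshold, and the assumption that a word
   embedding fills every output cell are illustrative only. It does not call
   or reproduce `MatrixBlock.evalSparseFormatInMemory`, which may use
   different criteria.

```java
// Hypothetical sketch, not SystemDS code: all names and the threshold are assumptions.
class SparsityDecisionSketch {
    // Assumed cutoff: prefer a sparse layout when fewer than 40% of cells are nonzero.
    static final double ASSUMED_SPARSITY_THRESHOLD = 0.4;

    // Simplified encoder categories used only for this illustration.
    enum EncoderKind { ONE_NNZ_PER_INPUT_CELL, UDF, WORD_EMBEDDING }

    // Estimate the number of nonzeros in the encoded output.
    static long estimateNnz(EncoderKind kind, long rows, long inCols, long outCols) {
        switch (kind) {
            case UDF:
                // Output sparsity cannot be derived, so assume the worst case.
                return rows * outCols;
            case WORD_EMBEDDING:
                // Assumption: each row receives a fully populated embedding vector,
                // so the count is known up front.
                return rows * outCols;
            default:
                // Assumption: at most one nonzero per input cell.
                return rows * inCols;
        }
    }

    // Decide the layout from the estimated fraction of nonzero cells.
    static boolean preferSparse(long rows, long cols, long nnz) {
        if (rows <= 0 || cols <= 0)
            return false;
        double sparsity = (double) nnz / ((double) rows * (double) cols);
        return sparsity < ASSUMED_SPARSITY_THRESHOLD;
    }

    public static void main(String[] args) {
        long rows = 1000;
        long embNnz = estimateNnz(EncoderKind.WORD_EMBEDDING, rows, 1, 300);
        long cellNnz = estimateNnz(EncoderKind.ONE_NNZ_PER_INPUT_CELL, rows, 2, 500);
        System.out.println("embedding sparse? " + preferSparse(rows, 300, embNnz));
        System.out.println("per-cell sparse?  " + preferSparse(rows, 500, cellNnz));
    }
}
```

   Under these assumptions the embedding case resolves to dense through the
   estimate itself, so a separate `!hasWE` guard would be redundant; whether
   that holds in the actual code depends on how `estNNz` is computed for
   embeddings.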


