ilovemesomeramen commented on a change in pull request #1261:
URL: https://github.com/apache/systemds/pull/1261#discussion_r631392605



##########
File path: src/main/java/org/apache/sysds/runtime/transform/encode/MultiColumnEncoder.java
##########
@@ -72,11 +96,60 @@ public MatrixBlock encode(FrameBlock in) {
        }
 
        public void build(FrameBlock in) {
-               for(ColumnEncoder columnEncoder : _columnEncoders)
-                       columnEncoder.build(in);
+               build(in, 1);
+       }
+
+       public void build(FrameBlock in, int k) {
+               if(MULTI_THREADED && k > 1) {
+                       buildMT(in, k);
+               }
+               else {
+                       for(ColumnEncoder columnEncoder : _columnEncoders)
+                               columnEncoder.build(in);
+               }
                legacyBuild(in);
        }
 
+       private void buildMT(FrameBlock in, int k) {
+               int blockSize = BUILD_BLOCKSIZE <= 0 ? in.getNumRows() : BUILD_BLOCKSIZE;
+               List<Callable<Integer>> tasks = new ArrayList<>();
+               ExecutorService pool = CommonThreadPool.get(k);
+               try {
+                       if(blockSize != in.getNumRows()) {
+                               // Partial builds and merges
+                               List<List<Future<Object>>> partials = new ArrayList<>();
+                               for(ColumnEncoderComposite encoder : _columnEncoders) {
+                                       List<Callable<Object>> partialBuildTasks = encoder.getPartialBuildTasks(in, blockSize);
+                                       if(partialBuildTasks == null) {
+                                               partials.add(null);
+                                               continue;
+                                       }
+                                       partials.add(pool.invokeAll(partialBuildTasks));
+                               }
+                               for(int e = 0; e < _columnEncoders.size(); e++) {
+                                       List<Future<Object>> partial = partials.get(e);
+                                       if(partial == null)
+                                               continue;
+                                       tasks.add(new ColumnMergeBuildPartialTask(_columnEncoders.get(e), partial));
+                               }

Review comment:
       Since opening this PR I have done a lot more testing, and this partial
building is rather complicated, especially since a ton of intermediates are
created, which increases GC pressure. At the moment, partial building is not
really viable in most scenarios. This will be good to discuss on Friday.
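
To make the concern concrete, here is a minimal, self-contained sketch of the
partial-build-then-merge pattern. It is not the SystemDS implementation: the
class `PartialBuildSketch` and the methods `partialTask` and `buildPartitioned`
are hypothetical names, a plain `String[]` stands in for one frame column, and
a standard fixed thread pool stands in for `CommonThreadPool`. It only shows
where the intermediates come from: every row block allocates its own set, and
each of those sets becomes garbage as soon as it has been merged.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PartialBuildSketch {
    // Hypothetical partial build: collect the distinct values of rows [start, end).
    static Callable<Set<String>> partialTask(String[] column, int start, int end) {
        return () -> {
            Set<String> distinct = new HashSet<>(); // one intermediate per row block
            for (int i = start; i < end; i++)
                distinct.add(column[i]);
            return distinct;
        };
    }

    // Builds the distinct-value set of a column from per-block partial results.
    static Set<String> buildPartitioned(String[] column, int blockSize, int k) throws Exception {
        // As in the diff, a non-positive block size means "one block for all rows".
        int step = blockSize <= 0 ? Math.max(1, column.length) : blockSize;
        ExecutorService pool = Executors.newFixedThreadPool(k);
        try {
            List<Callable<Set<String>>> tasks = new ArrayList<>();
            for (int start = 0; start < column.length; start += step)
                tasks.add(partialTask(column, start, Math.min(start + step, column.length)));
            // Merge step: each partial set is copied into the result and then discarded.
            Set<String> merged = new HashSet<>();
            for (Future<Set<String>> partial : pool.invokeAll(tasks))
                merged.addAll(partial.get());
            return merged;
        }
        finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        String[] column = {"a", "b", "a", "c", "b", "d"};
        System.out.println(buildPartitioned(column, 2, 2));
    }
}
```

With a small block size the number of short-lived sets grows with the row
count, and values that repeat across blocks are stored once per block before
the merge, which is the kind of overhead described above.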





