[ https://issues.apache.org/jira/browse/SYSTEMML-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Matthias Boehm updated SYSTEMML-2169:
-------------------------------------
    Description:
The introduction of nary cbind and rbind in SYSTEMML-1986 added support for operations like {{E = cbind(A,B,C,D)}}, which concatenates the matrices A, B, C, and D column-wise without the intermediates required by traditional binary cbind operations ({{cbind(cbind(cbind(A,B),C),D)}}). SystemML also provides rewrites that automatically collapse chains of cbind or rbind operations into their nary counterparts.

However, for distributed Spark operations, the binary cbind is still much better optimized than the nary operation, which only provides a general-case operation based on repartition joins.

This task aims to address this by extending {{BuiltinNarySPInstruction}} at runtime level (i.e., within {{processInstruction}}). Given the unlimited number of inputs, this runtime approach seems more appropriate than dedicated physical operations at compiler level. In detail, we need to evaluate whether a subset of the inputs fits into the broadcast budget and, if so, provide an alternative code path for nary cbind/rbind operations with broadcast joins.

Note that distributed codegen operations have the similar characteristic of unlimited inputs and already leverage broadcast variables when possible. Hence, we can probably use an approach similar to the one in {{SpoofSPInstruction}}.

  was:
The introduction of nary cbind and rbind in SYSTEMML-1986 added support for operations like {{E = cbind(A,B,C,D)}}, which concatenates the matrices A, B, C, and D column-wise without the intermediates required by traditional binary cbind operations ({{cbind(cbind(cbind(A,B),C),D)}}). SystemML also provides rewrites that automatically collapse chains of cbind or rbind operations into their nary counterparts.

However, for distributed Spark operations, the binary cbind is still much better optimized than the nary operation, which only provides a general-case operation based on repartition joins.
This task aims to address this by extending {{BuiltinNarySPInstruction}} at runtime level. Given the unlimited number of inputs, this runtime approach seems more appropriate than dedicated physical operations at compiler level. In detail, we need to evaluate whether a subset of the inputs fits into the broadcast budget and, if so, provide an alternative code path for nary cbind/rbind operations with broadcast joins.

Note that distributed codegen operations have the similar characteristic of unlimited inputs and already leverage broadcast variables when possible. Hence, we can probably use an approach similar to the one in {{SpoofSPInstruction}}.

> Spark nary cbind/rbind with broadcasts
> --------------------------------------
>
>                 Key: SYSTEMML-2169
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-2169
>             Project: SystemML
>          Issue Type: Task
>            Reporter: Matthias Boehm
>            Priority: Major
>              Labels: beginner
>
> The introduction of nary cbind and rbind in SYSTEMML-1986 added support for
> operations like {{E = cbind(A,B,C,D)}}, which concatenates the matrices A, B,
> C, and D column-wise without the intermediates required by traditional
> binary cbind operations ({{cbind(cbind(cbind(A,B),C),D)}}).
> SystemML also provides rewrites that automatically collapse chains of cbind
> or rbind operations into their nary counterparts.
> However, for distributed Spark operations, the binary cbind is still much
> better optimized than the nary operation, which only provides a general-case
> operation based on repartition joins.
> This task aims to address this by extending {{BuiltinNarySPInstruction}} at
> runtime level (i.e., within {{processInstruction}}). Given the unlimited
> number of inputs, this runtime approach seems more appropriate than dedicated
> physical operations at compiler level. In detail, we need to evaluate whether
> a subset of the inputs fits into the broadcast budget and, if so, provide an
> alternative code path for nary cbind/rbind operations with broadcast joins.
> Note that distributed codegen operations have the similar characteristic of
> unlimited inputs and already leverage broadcast variables when possible.
> Hence, we can probably use an approach similar to the one in
> {{SpoofSPInstruction}}.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
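
The "subset of inputs fits into the broadcast budget" check from the description could look roughly like the sketch below. It is only an illustration under stated assumptions: the class {{BroadcastSelection}}, the method {{selectBroadcastInputs}}, the greedy smallest-first policy, and the rule of always keeping one input distributed are hypothetical and are not taken from {{BuiltinNarySPInstruction}} or {{SpoofSPInstruction}}. Input sizes and the budget are plain byte estimates supplied by the caller.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class BroadcastSelection {

    /**
     * Hypothetical helper (not a SystemML API): decides which inputs of an
     * nary cbind/rbind to broadcast. Inputs are considered smallest first and
     * marked for broadcast while their total size stays within the budget.
     * At least one input is always left distributed (assumption of this
     * sketch), so at most n-1 inputs are broadcast.
     */
    public static boolean[] selectBroadcastInputs(long[] sizesInBytes, long budgetInBytes) {
        int n = sizesInBytes.length;
        boolean[] broadcast = new boolean[n];

        // Order the input positions by estimated size, smallest first.
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < n; i++)
            order.add(i);
        order.sort(Comparator.comparingLong((Integer i) -> sizesInBytes[i]));

        long used = 0;
        for (int k = 0; k < n - 1; k++) {
            int i = order.get(k);
            if (used + sizesInBytes[i] > budgetInBytes)
                break; // all remaining inputs are at least as large
            broadcast[i] = true;
            used += sizesInBytes[i];
        }
        return broadcast;
    }

    public static void main(String[] args) {
        // Four inputs of 800 MB, 4 MB, 2 MB, and 600 MB with a 16 MB budget.
        long[] sizes = {800L << 20, 4L << 20, 2L << 20, 600L << 20};
        boolean[] bc = selectBroadcastInputs(sizes, 16L << 20);
        System.out.println(Arrays.toString(bc)); // [false, true, true, false]
    }
}
```

With such a selection, the inputs marked {{true}} would go through the broadcast-join path, and the remaining inputs would still use the existing repartition-join path. If no input is marked, the current general-case operation applies unchanged.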