[
https://issues.apache.org/jira/browse/SYSTEMML-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042275#comment-16042275
]
Matthias Boehm commented on SYSTEMML-1662:
------------------------------------------
great [~dhutchis] - let me clarify these one by one.
* DataOp: A DataOp can be either a persistent read/write or a transient
read/write - writes will always have at least one input, but all types can have
parameters (e.g., for CSV, literals for the delimiter, header, etc.).
* DataGenOp: A DataGenOp can be rand (or matrix constructor), sequence, or
sample - these operators have different parameters and use a map of parameter
type to hop position; it would be good to check at least the consistency between
the number of map entries and the number of inputs.
* ReorgOp: In general, these operators have one input (e.g., transpose, diag,
rev), but there are certain operators - specifically, sort (i.e., order), and
reshape - which take additional parameters such as the order by column and
target dimensions.
* TernaryOp: Yes, generally, these operators have three operands, with the
exception of ctable, which takes target dimensions (for padding and pruning).
* QuaternaryOp: Similarly, QuaternaryOps can have three or four inputs.
* SpoofFusedOp: Yes, in general, fused operators can have one or more inputs.
However, specific types of fused operators have specific constraints - for
example the OuterProduct template type will always have at least three inputs
(sparse driver, two matrices for outer product like matrix multiply).
So maybe it's a good idea to create a method that returns whether the number of
inputs is correct, instead of returning the expected number of inputs - this
way, we can easily check for consistency within each hop in an
operation-specific manner.
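
To make the suggestion concrete, here is a minimal, standalone sketch of such an
operation-specific check. It does not use the actual SystemML classes: the
{{HopKind}} enum, the {{hasValidInputCount}} method, and the parameter map are
hypothetical stand-ins, and the exact bounds for sort/reshape and ctable
(more than one input, at least three inputs) are assumptions derived from the
descriptions above.

```java
import java.util.HashMap;
import java.util.Map;

public class HopArityCheckSketch {

    // Hypothetical, simplified stand-ins for hop types; not the SystemML classes.
    enum HopKind {
        DATA_READ, DATA_WRITE, DATAGEN,
        REORG_SIMPLE, REORG_SORT_OR_RESHAPE,
        TERNARY, TERNARY_CTABLE, QUATERNARY,
        SPOOF_FUSED, SPOOF_OUTER_PRODUCT
    }

    // Returns whether the number of inputs is valid for the given hop kind,
    // rather than returning a single expected number of inputs.
    // paramIndex is only used for DATAGEN (parameter name -> input position).
    static boolean hasValidInputCount(HopKind kind, int numInputs,
                                      Map<String, Integer> paramIndex) {
        switch (kind) {
            case DATA_READ:             // reads may have only optional parameters
                return numInputs >= 0;
            case DATA_WRITE:            // writes have at least one input
                return numInputs >= 1;
            case DATAGEN:               // map entries must match the inputs
                return paramIndex != null && paramIndex.size() == numInputs;
            case REORG_SIMPLE:          // e.g., transpose, diag, rev
                return numInputs == 1;
            case REORG_SORT_OR_RESHAPE: // assumption: data plus extra parameters
                return numInputs > 1;
            case TERNARY:
                return numInputs == 3;
            case TERNARY_CTABLE:        // assumption: optional target dimensions
                return numInputs >= 3;
            case QUATERNARY:
                return numInputs == 3 || numInputs == 4;
            case SPOOF_FUSED:
                return numInputs >= 1;
            case SPOOF_OUTER_PRODUCT:   // sparse driver plus two matrices
                return numInputs >= 3;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> randParams = new HashMap<>();
        randParams.put("rows", 0);
        randParams.put("cols", 1);

        System.out.println(hasValidInputCount(HopKind.DATAGEN, 2, randParams));
        System.out.println(hasValidInputCount(HopKind.QUATERNARY, 5, null));
        System.out.println(hasValidInputCount(HopKind.SPOOF_OUTER_PRODUCT, 2, null));
    }
}
```

In the real code base, this would more naturally be an abstract method on the
hop base class that each operator overrides, so that the validator can call it
uniformly on every node.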
> Extended HOP DAG validator
> --------------------------
>
> Key: SYSTEMML-1662
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1662
> Project: SystemML
> Issue Type: Sub-task
> Components: Compiler
> Reporter: Matthias Boehm
> Labels: beginner
> Fix For: SystemML 1.0
>
>
> This task aims to extend the existing HOP DAG validator (see
> {{org.apache.sysml.hops.rewrite.HopDagValidator}}, which can be enabled via
> {{org.apache.sysml.hops.rewrite.ProgramRewriter.CHECK}}) in various ways in
> order to provide better developer tooling for checking the correctness of new
> and existing rewrites.
> So far, this validator checks only for:
> * Correct parent node linking
> * Correct child node linking
> * Non-empty children (for all hops other than {{DataOp}} and {{LiteralOp}})
> Possible extensions include (but are not limited to):
> * Correct HOP output data types
> * Correct HOP output value types
> * Correct number of expected child nodes
> * Correct output size information wrt input sizes
> * Correct visit status
> These extensions would be very useful for multiple reasons. First, they would
> detect rewrite issues early on in the development process. This is important
> because rewrite issues usually lead to strange and non-obvious behavior of
> real application scripts. Second, the HOP DAG validator provides a systematic
> way of debugging optimizer issues. The intended future workflow is as follows:
> * 1. Disable rewrites via optimization level 1 to determine if rewrites are
> the issue.
> * 2. Use the extended {{HopDagValidator}} to find the source of
> corruption.
> * 3. If (2) did not find the issue, resort to low-level debugging, and extend
> the {{HopDagValidator}} to capture the root cause of the issue.