Hi,

I have sent a pull request for this issue. As a next step, could you suggest a new issue, or anything else I should do to familiarize myself with the "Language and runtime for parameter servers" project?
Regarding the project proposal, I have a few questions:

* The epic has several subtasks. Is it enough to focus on a single task throughout the summer? Would that be enough workload, or should I go for multiple tasks?
* How are the subtasks linked? Do tasks such as the distributed Spark backend or the local multi-threaded backend need earlier tasks to be completed before work on them can start?

I would be glad if you could suggest some issues related to the distributed Spark backend or multi-threaded backend tasks.

Thanks.

Regards,
Chamath

On Fri, Mar 2, 2018 at 6:46 AM, Matthias Boehm <mboe...@gmail.com> wrote:

> Hi Chamath,
>
> in general, you're absolutely right - you can enable -stats and
> programmatically probe the heavy hitter statistics for certain opcodes.
> However, uamin and uamax stand for "unary aggregate minimum" and "unary
> aggregate maximum", which correspond to min(X) and max(X) on script level.
> Instead, all generated fused operators are prefixed with spoof or sp_spoof
> (for distributed spark operations). The related junit assertion should
> already be in the existing tests, I just mentioned it for completeness.
>
> Regards,
> Matthias
>
> On Thu, Mar 1, 2018 at 4:30 AM, Chamath Abeysinghe <
> abeysinghecham...@gmail.com> wrote:
>
>> Thanks for your detailed reply.
>> I did some coding
>> <https://github.com/apache/systemml/compare/master...chamathabeysinghe:SYSTEMML-2159?diff=split&name=SYSTEMML-2159>
>> [1] for this issue SYSTEMML-2159 to extend test cases for FA & FNR. I got
>> a problem regarding the success criteria, specifically the "generating at
>> least one fused operator" condition. I think this means I have to look
>> into the stats of heavy hitter instructions and check if there are any
>> fused operators. (My guess is that uamin and uamax are the operators I
>> have to look for, but I am not sure about this because I don't know the
>> meaning of these instructions.)
>>
>> Please help me to clarify this.
>> If my approach is correct, I could send a
>> PR after fixing tests for other algorithms. Thanks.
>>
>> Regards,
>> Chamath
>>
>> [1] https://github.com/apache/systemml/compare/master...chamathabeysinghe:SYSTEMML-2159?diff=split&name=SYSTEMML-2159
>>
>>
>> On Tue, Feb 27, 2018 at 1:54 AM, Matthias Boehm <mboe...@gmail.com>
>> wrote:
>>
>>> ---------- Forwarded message ----------
>>> From: Matthias Boehm <mboe...@gmail.com>
>>> Date: Mon, Feb 26, 2018 at 11:59 AM
>>> Subject: Re: Extending Codegen algorithm tests for heuristics
>>> To: Chamath Abeysinghe <abeysinghecham...@gmail.com>
>>>
>>>
>>> great - thanks for taking this over Chamath.
>>>
>>> In general, I would recommend using this task to explore SystemML a
>>> little. For example, take one of the codegen algorithm tests from
>>> org.apache.sysml.test.integration.functions.codegenalg (e.g.,
>>> AlgorithmL2SVM) and pass different flags such as -stats, -explain,
>>> -explain recompile_hops, -explain recompile_runtime to programArgs and
>>> try to understand the output. If you come across specific questions,
>>> please just ask.
>>>
>>> To answer your detailed questions:
>>>
>>> 1) We recently added a code generation framework that automatically
>>> identifies opportunities for fused operators and subsequently generates
>>> code for these operators. A major part is the selection of fusion plans,
>>> for which we provide heuristics and a cost-based optimizer. By default
>>> (and thus also in our testsuite), we use the cost-based optimizer, but it
>>> would be good to regularly test the heuristics as well.
>>>
>>> 2) You can configure the used optimizer in your SystemML-config.xml file
>>> as follows:
>>> <sysml.codegen.optimizer>fuse_all</sysml.codegen.optimizer>
>>> Valid alternatives are: fuse_all, fuse_no_redundancy, fuse_cost_based,
>>> and fuse_cost_based_v2 (default). You can provide alternative config xml
>>> files and switch them dynamically via getConfigTemplateFile.
>>>
>>> 3) Similar to the existing tests, it needs to (1) run without errors, (2)
>>> produce correct results as compared to R, and (3) generate at least one
>>> fused operator.
>>>
>>> Regards,
>>> Matthias
>>>
>>> On Mon, Feb 26, 2018 at 6:54 AM, Chamath Abeysinghe <
>>> abeysinghecham...@gmail.com> wrote:
>>>
>>> > Hi All,
>>> > As per the guidelines given to GSoC students, I would like to work on
>>> > the SYSTEMML-2159 [1] issue as a starting point. But I don't understand
>>> > the background of the issue. Can someone help me with understanding the
>>> > context of this issue?
>>> >
>>> > A few questions I have:
>>> >
>>> > 1) What are fusion heuristics, fuse-all and fuse-no-redundancy?
>>> > 2) Can I pass those heuristic related configurations as args to execute
>>> > DMLScript?
>>> > 3) What are the success criteria for a test that uses those heuristics?
>>> >
>>> > Thank you in advance
>>> >
>>> > Regards,
>>> > Chamath
>>> >
>>> > [1] https://issues.apache.org/jira/browse/SYSTEMML-2159
>>> >
>>> > --
>>> > Chamath Abeysinghe
>>> > Department of Computer Science and Engineering
>>> > University of Moratuwa
>>> > Mobile : +94752930548
>>> >
>>>
>>
>

--
Chamath Abeysinghe
Department of Computer Science and Engineering
University of Moratuwa
Mobile: +94712803295
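
Note on the "at least one fused operator" criterion discussed in the thread: Matthias's reply says that generated fused operators are prefixed with spoof or sp_spoof, while uamin and uamax are ordinary unary aggregates. The check this implies might be sketched roughly as below. This is only an illustration under stated assumptions: the class name FusedOperatorCheck, the method hasFusedOperator, and the opcode "sp_spoof_x" are hypothetical and not part of SystemML, and how the opcode list is obtained from the -stats heavy hitter output is not shown. The existing tests already contain the real junit assertion.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class FusedOperatorCheck {
    // Prefixes named in the thread: "spoof" for fused operators and
    // "sp_spoof" for distributed Spark fused operators.
    private static final List<String> FUSED_PREFIXES = Arrays.asList("spoof", "sp_spoof");

    // Hypothetical helper (not a SystemML API): returns true if at least one
    // opcode in the given collection starts with a fused-operator prefix.
    public static boolean hasFusedOperator(Collection<String> opcodes) {
        for (String opcode : opcodes) {
            if (opcode == null) {
                continue; // ignore missing entries
            }
            for (String prefix : FUSED_PREFIXES) {
                if (opcode.startsWith(prefix)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hand-written opcode lists for illustration only; "sp_spoof_x" is made up.
        List<String> withFusion = Arrays.asList("uamin", "sp_spoof_x");
        List<String> withoutFusion = Arrays.asList("uamin", "uamax");

        System.out.println(hasFusedOperator(withFusion));    // prints "true"
        System.out.println(hasFusedOperator(withoutFusion)); // prints "false"
    }
}
```

A prefix match is used here because the thread only states the prefixes, not the full names of the generated operators.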