Folks,

We're testing out a basic scripting language and interpreter for report writing and quality control. It is meant to give command-line programmers as well as Galaxy platform researchers and admins ways to tweak workflow quality control behaviour without having to be programmers themselves.

I wanted to sound out the community about whether there are any basic objections to our approach, described briefly below.

Currently the interpreter provides all the built-in Python operator and math functions (as well as some named-group regular expression functions for text mining), so that users can examine input log or data datasets for fields that need to be reported or compared with QC metric rules.

The Report Calc tool takes in a file of statements, each using a function(parameter1 parameter2 ...) syntax, that look like this:

    set(4857000 report/contigs/reference_genome_size)
    set("serovar Typhimurium LT2" report/contigs/reference_genome)
    set(0.1 report/contigs/good_genome_size_ratio)
    set( statisticN(report/contigs/contig_lengths 50) report/contigs/N50)
    if( lt(/N50 200000) set(report/job/status FAIL))

Math is accomplished by Python built-in math functions (ignore the "/"; that's a namespace syntax character):

    set( truediv( abs( sub( /sampleGenomeSize /referenceGenomeSize)) /referenceGenomeSize) report/contigs/sample_genome_size_ratio )

The tool writes any output, as desired, to a standard tool output folder:

    writeFile( pageHtml( getHtml(report) "My Report Widget") report.html )

It allows users to build a ruleset (a file containing statements like the above) and process text, JSON and tabular datasets in their history, and it can manipulate variables, arrays and dictionaries in the tool's in-memory temporary data structure namespace. It doesn't touch Galaxy's inner workings or interact with a workflow except by way of app exit codes.

One can even write little text-mining programs in it:

    iterate( readFileByName(contigs-all.fasta)
        if( eq( getitem(iterator/0/value 0) ">")
            append( regexp(iterator/0/value "length_(?P<value>\\d+)_") report/contigs/contig_lengths )
        )
    )

I'll release it shortly for review/play, with lots of documentation that describes in detail what it can and can't do, and an argument for why one might want to bother using or learning it.
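
For anyone who wants a feel for how statements like the ones above might be evaluated, here is a minimal sketch in Python. It is not the Report Calc implementation. The tokenizer, the `FUNCTIONS` whitelist, the helper names (`tokenize`, `parse`, `resolve`, `assign`, `evaluate`, `run`) and the way a leading "/" is simply stripped from a path are all assumptions made for illustration. The only real library calls are from Python's standard `operator` and `re` modules.

```python
import operator
import re

# Hypothetical whitelist: only these names may be called from a ruleset.
FUNCTIONS = {
    "sub": operator.sub,
    "abs": operator.abs,
    "truediv": operator.truediv,
    "lt": operator.lt,
}

# A token is a quoted string, a parenthesis, or a run of other characters.
TOKEN = re.compile(r'"[^"]*"|[()]|[^\s()]+')


def tokenize(text):
    return TOKEN.findall(text)


def parse(tokens, pos=0):
    """Parse one expression at tokens[pos]; return (node, next_pos)."""
    token = tokens[pos]
    if pos + 1 < len(tokens) and tokens[pos + 1] == "(":
        args = []
        pos += 2
        while tokens[pos] != ")":
            arg, pos = parse(tokens, pos)
            args.append(arg)
        return (token, args), pos + 1
    return token, pos + 1


def resolve(path, namespace):
    """Read a slash-separated path from nested dicts (assumed layout)."""
    node = namespace
    for key in path.strip("/").split("/"):
        node = node[key]
    return node


def assign(path, value, namespace):
    """Write a value at a slash-separated path, creating dicts as needed."""
    keys = path.strip("/").split("/")
    node = namespace
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


def evaluate(node, namespace):
    if isinstance(node, tuple):
        name, args = node
        if name == "set":  # set(value location), as in the examples above
            value = evaluate(args[0], namespace)
            assign(args[1], value, namespace)
            return value
        if name not in FUNCTIONS:
            raise NameError(f"function not allowed: {name}")
        return FUNCTIONS[name](*(evaluate(a, namespace) for a in args))
    if node.startswith('"'):
        return node.strip('"')
    try:
        return float(node) if "." in node else int(node)
    except ValueError:
        return resolve(node, namespace)  # not a number, so treat as a path


def run(statement, namespace):
    node, _ = parse(tokenize(statement))
    return evaluate(node, namespace)


# Made-up sample size, purely for illustration.
namespace = {"sampleGenomeSize": 5000000, "referenceGenomeSize": 4857000}
run("set( truediv( abs( sub( /sampleGenomeSize /referenceGenomeSize))"
    " /referenceGenomeSize) report/contigs/sample_genome_size_ratio )",
    namespace)
print(round(namespace["report"]["contigs"]["sample_genome_size_ratio"], 4))
# prints 0.0294
```

Because every call goes through the lookup table, anything not listed there is rejected with an error rather than executed, which is one simple way to keep a ruleset from reaching arbitrary Python.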

There are one or two irrelevant Python built-in functions we might filter out (e.g. isCallable()). So far we haven't spotted any security issues, and we've limited flow control so that loops can only run over iterables, so there's no evident way to create an infinite loop: one only iterates through files or in-memory arrays.

Suggestions for functionality to add to the wish-list are also welcome now!

Regards,
Damion