It is the lack of compatibility between different versions mentioned by
Ethan that really put me off learning Python. In contrast, the
FORTRAN-66 program SHELX76 still compiles and runs correctly with any
modern FORTRAN compiler. The only significant 'new' features that I now
use are dynamic array allocation (introduced in FORTRAN-90) and OpenMP
support for multiple CPUs, but even programs using OpenMP would still
work with older compilers because the OpenMP instructions would be
treated as comments.
George
On 09/12/2012 08:28 PM, Ethan Merritt wrote:
On Wednesday, September 12, 2012 09:52:09 am Jacob Keller wrote:
For the specific purpose you list -
input from tab-delimited data
output to simple statistical summaries and (I assume) plots
- it sounds like gnuplot could do the job nicely.
I wasn't aware that gnuplot can do calculations--can it? I was probably
going to use it somewhere as a plotting option.
Here's a simple-minded example using a dump of the current contents
of the PDB from www.pdb.org as a comma-separated file with ~65000 entries.
The input file was previously filtered to contain only X-ray structures
between 1 and 4 Angstroms resolution.
gnuplot> !head -3 PDB.csv
PDB ID,R Observed,R All,R Work,R Free,Refinement Resolution
"100D","0.145","","0.145","","1.90"
"101D","0.163","","","0.252","2.25"
gnuplot> set datafile separator ","
gnuplot> set datafile nofpe_trap   # trap handling greatly slows large data sets
gnuplot> stats 'PDB.csv' using "R Observed" prefix "Robs"
* FILE:
Records: 63029
Out of range: 0
Invalid: 0
Blank: 2
Data Blocks: 2
* COLUMN:
Mean: 0.1982
Std Dev: 0.0334
Sum: 12494.6900
Sum Sq.: 2547.3068
Minimum: 0.0450 [24518]
Maximum: 0.9700 [45024]
Quartile: 0.1770
Median: 0.1970
Quartile: 0.2180
gnuplot> print Robs_mean
0.198237160672072
gnuplot> #calculate correlation of Robs with Resolution
gnuplot> stats 'PDB.csv' using "R Observed":"Refinement Resolution" nooutput
gnuplot> print STATS_correlation
0.595763711910418
I've attached graphical output of the same data after some sorting,
filtering, binning, etc., with output to a PDF file.
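(Since list archives usually drop attachments: a sketch of the kind of gnuplot
commands that could produce such a PDF, reusing the PDB.csv file from the stats
example above. The bin width, output filename, and the bin() helper are my own
illustrative choices, not necessarily what Ethan actually ran.)

```gnuplot
# Sketch only: mean "R Observed" binned by resolution, from the same
# PDB.csv whose header line is shown earlier in this thread.
set datafile separator ","
set terminal pdfcairo size 5in,3.5in
set output 'Robs_vs_resolution.pdf'
set xlabel "Refinement Resolution (Angstrom)"
set ylabel "R Observed"
# bin resolution into 0.1 A slots; using a string with column() makes
# gnuplot match it against the labels in the file's first line
bin(x,w) = w * (floor(x/w) + 0.5)
plot 'PDB.csv' \
     using (bin(column("Refinement Resolution"),0.1)):(column("R Observed")) \
     smooth unique title "mean R-obs per 0.1 A bin"
unset output
```

"smooth unique" averages all y values sharing the same (binned) x, which is
what turns the raw scatter into a per-bin mean curve.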
You can do all this in R also. R has a larger collection of statistics
options, but is not as good at dealing with really large data sets.
IMHO gnuplot has more flexible options for graphical output.
Otherwise I'd recommend perl, and dis-recommend python.
Why are you dis-ing python? Seems everybody loves it...
I'm sure you can google for many "reasons I hate Python" lists.
Mine would start
1) sensitive to white space == fail
2) dynamic typing makes it nearly impossible to verify program correctness,
and very hard to debug problems that arise from unexpected input or
a mismatch between caller and callee.
3) the language developers don't care about backward compatibility;
it seems version 2.n+1 always breaks code written for version 2.n,
and let's not even talk about version 3
4) sloooow unless you use it simply as a wrapper for C++,
in which case why not just use C++ or C to begin with?
5) not thread-safe
you did ask...
Ethan
--
Prof. George M. Sheldrick FRS
Dept. Structural Chemistry,
University of Goettingen,
Tammannstr. 4,
D37077 Goettingen, Germany
Tel. +49-551-39-3021 or -3068
Fax. +49-551-39-22582