On 05/17/2014 10:34 AM, Lux, Jim (337C) wrote:




On 5/13/14, 4:48 PM, "Ellis H. Wilson III" <[email protected]> wrote:


Wrapping this back into the original issue (next-gen HPC languages), I
think the core issue is non-programmers programming.  <begin
generalization>  They don't really want to program.  They're doing it as
a means to an end.  They'd be more than happy to write equations in lieu
of routines, as the article alludes to. <end generalization>

Actually, I think this is the core thing.  For most people, they are
interested in doing their job, not programming, whether they are just
typing a book report or doing a full scale simulation of the earth's
atmosphere.  The programming is a means to an end.

This is true, but this is the wrong attitude, and I think it's a result of both the educational system and the extreme that the 'publish or perish' academic paradigm has evolved into. It's also a lesson in false economies.

Why the educational system? Well, the older scientists I've worked with know the minutiae of all aspects of computing - programming languages, processor designs, even assembly. Since this is pretty common with scientists of a certain age, I assume that in years past learning this computer science went hand in hand with learning their science (physics, etc.). The younger scientists I work with all seem to have lousy programming skills, no respect for the details of computing, prefer to work in MATLAB (or similar), and if they run into a problem, show little interest or know-how in getting to the root of it.

I wouldn't call the above statements generalizations as much as trends I've noticed over the past 10-15 years. There could be other causes, but I think the main reason for this is that schools are no longer putting enough emphasis on understanding computers as a tool of research. Now that we have things like MATLAB and Mathematica, why waste precious credit-hours teaching them how to program?

I know this to be true based on my curriculum as a Chemical Engineering student 20 years ago compared to the curriculum at the same school today. When I was a student, all engineers had to take Fortran for Engineering their freshman year. My numerical methods class, which was a Chemical Engineering class taught by a Chem Eng professor, taught us how to do matrix operations and ODE integrators line-by-line. I'm pretty sure they don't teach Fortran to engineers any more, and I know that the numerical methods class is based on MATLAB now. I wouldn't be surprised if that numerical methods class is completely eliminated in the next few years.

I think this is wrong. If you are a carpenter, your main goal is to build a house, but I'm sure you are still taught how to use every tool in your toolbox safely and effectively in vocational school. Who would hire a carpenter who was never taught how to cut crown moulding properly with a mitre saw? Hiring a scientist or engineer who doesn't know how to use computers effectively is the same thing.

I find that in the current academic research environment, the SOP in many places is to crank out crappy code, get results as quickly as possible, publish them, and move on to the next paper, all with the goal of cranking out as many papers as possible in order to get as many grants as possible. I liken this to a restaurant that tries to increase revenue by rushing customers through their meals so it can turn over as many tables as possible in an evening, with no concern for the dining experience.

I say this is a false economy because the real goal is time to solution, not time to code completion. I've seen many situations where people write code quickly, and then their simulation runs for a month. Another person comes along, makes some minor changes, and the simulation now completes in less than a day. Now multiply that time difference by multiple simulations... If they spent a little more time learning about the art of coding, or just learning some 'best practices', and spent a few more days coding, they could save themselves weeks or months of time waiting for results. Or even years over an entire career, but few ever see it that way.
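
As a rough illustration of the kind of "minor change" meant here, the sketch below computes the same running averages two ways. It is a hypothetical toy, not taken from any real simulation code, and the function names are made up for this example. The first version re-sums the data at every step, so its cost grows with the square of the input size; the second carries a running total forward and grows only linearly.

```python
def running_means_slow(values):
    """Mean of each prefix, recomputing the sum from scratch each time.

    Quick to write, but does O(n^2) additions for n values.
    """
    return [sum(values[: i + 1]) / (i + 1) for i in range(len(values))]


def running_means_fast(values):
    """Same result, but carries the running total forward: O(n) additions."""
    means = []
    total = 0.0
    for count, value in enumerate(values, start=1):
        total += value
        means.append(total / count)
    return means


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    print(running_means_slow(data))
    print(running_means_fast(data))
```

Both functions return the same answers on this input; only the amount of repeated work differs, which is the sort of gap that can separate a month-long run from a day-long one on large inputs.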






Therefore,
maybe, instead of continuing to attempt to create the "perfect language"
that fits their needs,

The challenge is that there are so many problem domains that what you
really need is a custom language tailored to each of them.  And isn't that
what we have with large subroutine libraries and what not? Someone who is
stringing together calls to library routines is basically programming in a
domain specific language (with a strange hybrid of the syntax and
semantics of the underlying implementation language, rather than something
that is domain relevant).

Or, what you see is domain specific pre and post processors for the
underlying numerical computations.  For the Numerical Electromagnetics
Code there's dozens of preprocessors and post processors ranging from
Excel spreadsheets and macros to dedicated graphics editors and
sophisticated plotting programs (since the underlying code is really
looking for 80 column input records and generates 132 column output
files).  But those pre-post processors are sort of narrow, and don't
really rise to a "programming language" in that they have a fairly
simplistic architectural model.  They provide some basic iteration and
computation syntax (e.g., one can systematically change the length of an
element of the model and get a summary of the output), but it's not like
you can actually do "programming".  You couldn't write a customized
optimizer using the pre- and post-processor capabilities.

The same is true in structural analysis and in computational flow and, I'm
sure, although I have no experience in it, with computational chemistry.
Anyone who is doing lots of this kind of thing has the basic validated
simulation codes and a huge toolbox of modeling and analysis stuff.  Maybe
it's programs that take a solid model and automatically generate the FEM
grid. Maybe it's a program or routine that takes the raw analysis output
and generates output in a particularly useful format (domain specific).

People optimizing race cars do not literally re-invent the wheel model
each time they do a new simulation and analysis.


maybe the better solution is to teach them the
tenets of proper programming so they can grasp the process and instruct
them on ways to write very clean and elegant design documents.  Sure, in
some cases that may take as long just to get the design doc done as it
would for them to just code it, but in the long run if said code gets
wrapped into a larger project (or grows into one) it will result in far
less maintenance and complexity than having 10 physicists and 10 CS
folks both playing with the code simultaneously.

Never going to happen: a lot of scientific computation is done by
incremental development without a clear picture of where the end point is.
  You write some Matlab code to analyze some raw data you collected.  Hmm,
that looks interesting, so you graft on another step of processing.  That
looks better, but, hey, this aspect is interesting now, so you write some
more code to do the processing needed.

This is very true.

Rarely does someone start out with a clean sheet and say "I'm going to
write a numerical simulation of the weather", because that would be a
herculean (and expensive) task.  Particularly in what I'll call scientific
computation, the government funded development process is characterized by
receiving relatively small amounts of budget over many, many years.  If
you go to the NSF site for instance, and look at the several dozen awards
for climate and large scale dynamics, you'll see that they're pretty much
all in the "few $100k" range.  Those PIs receiving the funds are
interested more in the science than in the software engineering (it is the
National *Science* Foundation, after all).

It is possible that there is significant commercial development of these
sorts of models (almost certainly the case in the biotech field) and I
would imagine that they DO actually use better design processes.

And for something like NASTRAN, where there is a clearly identified large
scale need, it could get funded with a larger chunk of change, and
hopefully use decent software engineering.



James Lux, P.E.
Task Manager, FINDER - Finding Individuals for Disaster and Emergency
Response
Co-Principal Investigator, SCaN Testbed (née CoNNeCT) Project
Jet Propulsion Laboratory
4800 Oak Grove Drive, MS 161-213
Pasadena CA 91109
+1(818)354-2075
+1(818)395-2714 (cell)


_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


--
Prentice
