Hi Josh.
We are interested in models for distributed computing that can
easily exist in very heterogeneous environments: from high-performance
computers, web-service servers, and desktop PCs down to phones and
other specialized network devices with few resources but,
interestingly, the lowest latency with regard to the so-called 'user'.
We are currently developing the Fhat processor in Java and for
various triple-store interfaces. What this means is that nothing will
run faster than Java, and on top of that, it will always be
constrained by the read/write speed of the triple-store. For triple-
stores like AllegroGraph, this is fast; for stores like Kowari... I
dunno---we will see when we benchmark the prototype. This is not a
high-performance computing paradigm so much as it is a distributed
computing (Internet computing) paradigm. Let me point you to this
related article by some Carnegie Mellon people:
http://isr.cmu.edu/doc/isr-ieee-march-2007.pdf
Would this new language make managing data and processes on multiple
computers easier to program for in a more general sense? How do we
make a network-based computer that gets us away from having to
worry about where a particular data set is, or where a
particular process is running? I know this is focused on the
Semantic Web, but can this help me deal with managing my many
overlapping data streams that I want available on any computer I
come in contact with, such as model output or, more importantly,
digital photos, MP3s, and videos?
If you do represent your data in RDF (which could be a resolvable URI
to some byte-stream---e.g. music, movies, images, etc.), then your
data is always present/accessible on the "Semantic Web" (in a triple-
store repository or pointed to by a URI in the RDF network that can
be resolved to a "physical" file). Furthermore, the execution of that
data is also on the Semantic Web (the RVM state is represented in
RDF). Let's say you want to move to computer B, but you have something
executing on your current computer A. Well, you just halt the RVM and
it's stored frozen in the Semantic Web (you can halt at the
instruction level, meaning in mid-method). Then you just move to
computer B and start the RVM process up again. It continues at the
last instruction you halted it at. To the RVM, time didn't stop. You
can move to different computers and always have the same applications
running where you left off---no sleeping or shutting down.
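The halt-and-resume idea can be sketched in plain Java (the language Fhat is being built in). Everything here is invented for illustration: the TripleStore, MachineState, and "rvm:" predicate names are stand-ins, not the actual Fhat design.

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch of halting an RVM into a triple-store and resuming it on
// another machine. No RDF library is used; the triple-store is just a
// list of (subject, predicate, object) rows.
public class FrozenRvm {

    static class TripleStore {
        List<String[]> triples = new ArrayList<>();
        void add(String s, String p, String o) { triples.add(new String[]{s, p, o}); }
        String find(String s, String p) {
            for (String[] t : triples)
                if (t[0].equals(s) && t[1].equals(p)) return t[2];
            return null;
        }
    }

    // The RVM state: a program counter and an operand stack.
    static class MachineState {
        int pc;
        List<Integer> stack = new ArrayList<>();
    }

    // Halt on computer A: write the state into the store at instruction level.
    static void freeze(MachineState m, TripleStore store, String rvmUri) {
        store.add(rvmUri, "rvm:pc", Integer.toString(m.pc));
        store.add(rvmUri, "rvm:stack", m.stack.toString());
    }

    // Resume on computer B: read the state back and continue where it left off.
    static MachineState thaw(TripleStore store, String rvmUri) {
        MachineState m = new MachineState();
        m.pc = Integer.parseInt(store.find(rvmUri, "rvm:pc"));
        String s = store.find(rvmUri, "rvm:stack");  // e.g. "[1, 2]"
        for (String tok : s.substring(1, s.length() - 1).split(", "))
            if (!tok.isEmpty()) m.stack.add(Integer.parseInt(tok));
        return m;
    }

    public static void main(String[] args) {
        TripleStore web = new TripleStore();         // stands in for the Semantic Web
        MachineState a = new MachineState();
        a.pc = 42;
        a.stack.add(1);
        a.stack.add(2);
        freeze(a, web, "urn:rvm:demo");              // halt on computer A, mid-method
        MachineState b = thaw(web, "urn:rvm:demo");  // start again on computer B
        System.out.println(b.pc + " " + b.stack);    // prints "42 [1, 2]"
    }
}
```

To the resumed machine, time never stopped: the program counter and stack are exactly where the halt left them.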
Again, your data, your computing machine (RVM), and your software
(triple-code) are all in the Semantic Web. It doesn't matter which
hardware device you are running the RVM process on (as long as the
RVM process code is written for that machine---that's why we are
building the Fhat process code in Java). Also, check this out. Assume
that your hardware CPU is VERY slow (let's say a mobile device). Well,
you need not use the mobile device's CPU to execute the RVM process.
You can leverage another CPU in the pool of Semantic Web hardware
devices to execute the code while the state changes are read by your
mobile device. Your mobile device is only an I/O device, not a "number
cruncher". You can have your home computer doing all the RVM process
code while your mobile device controls the flow of execution. Your
mobile device leverages the computing power of the desktop machine.
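A toy sketch of this division of labor: a "desktop" thread does the number crunching while the calling "mobile" side acts purely as an I/O device that controls the flow of execution and reads the result. Two threads in one JVM stand in for two machines sharing the Semantic Web; all names here are invented for the example.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// The mobile side offloads work to a desktop thread and only reads the
// resulting state change; it never does the computation itself.
public class RemoteCpu {

    // Hand n to the desktop thread and wait for the answer.
    static int offload(int n) {
        BlockingQueue<Integer> requests = new LinkedBlockingQueue<>(); // mobile -> desktop
        BlockingQueue<Integer> results  = new LinkedBlockingQueue<>(); // desktop -> mobile
        Thread desktop = new Thread(() -> {
            try {
                int job = requests.take();
                results.put(job * job);  // the "expensive" computation
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        desktop.start();
        try {
            requests.put(n);             // mobile controls the flow of execution
            int out = results.take();    // mobile reads the state change
            desktop.join();
            return out;
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("mobile read: " + offload(12)); // prints "mobile read: 144"
    }
}
```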
However, there is a great price to pay for all this. Because
EVERYTHING is represented internal to the triple-store, there is a
great deal of reads/writes. The triple-store is the bottleneck. While
you can federate triple-stores (which is invisible to the end
applications), this is still the limiting factor. However, we are not
only developing Fhat, but also r-Fhat (reduced Fhat), which is an RVM
whose state is not represented in RDF. This does not provide the nice
portability seen with Fhat, but it does greatly reduce the number of
reads/writes to the triple-store. For this reason, I wouldn't pose
this as a "high-performance" computing paradigm.
(I haven't thought much about multi-threading, where you can have
multiple RVM processes executing a single RDF program, but I know it's
possible and will write something up about it as the logistics
solidify in my mind... In such cases you would want your RDF software
distributed across multiple triple-stores so as to avoid the read/
write bottleneck.)
Finally, because everything in the Semantic Web is a URI (of which a
URL is a subclass), the software you write is on the world stage.
This gets into the whole Semantic Web Services concept. You can have
instantiated objects or APIs (objects that need instantiating) that
just exist and can be leveraged by your software. There is no
downloading of .jars and classpath fiddling. It's all in one large
software repository called the Semantic Web.
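As a small aside, the URI/URL distinction can be illustrated with plain java.net.URI: every resource gets a URI name, but only some URIs (the URLs) also carry a network location you can actually dereference. The helper name below is invented for this sketch, not a Semantic Web API.

```java
import java.net.URI;

// Everything on the Semantic Web is named by a URI; a URL is the special
// case of a URI that can also be resolved over the network to bytes.
public class UriDemo {

    // A URI is a locator (a URL, roughly) only if it is absolute and
    // carries a host we could actually open a connection to.
    static boolean dereferenceable(URI u) {
        return u.isAbsolute() && u.getHost() != null;
    }

    public static void main(String[] args) {
        URI name = URI.create("urn:isbn:0451450523");      // a pure name
        URI locator = URI.create("http://www.friam.org/"); // a name and a location
        System.out.println(dereferenceable(name));     // false: nothing to fetch
        System.out.println(dereferenceable(locator));  // true: resolvable to bytes
    }
}
```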
Hope this clears things up... Please ask more questions. It helps me
to clear up my thoughts (again, this is all very new to me too :).
Take care,
Marko.
I think a Wednesday tech talk would be very welcome.
--joshua
---
Joshua Thorp
Redfish Group
624 Agua Fria, Santa Fe, NM
On Apr 26, 2007, at 8:01 AM, Marko A. Rodriguez wrote:
LANL is currently building a compiler and virtual machine that is
compliant with the specification in the paper. If RedFish is
interested, perhaps in a month or two, I could demo this computing
paradigm at a Wednesday tech session.
============================================================
FRIAM Applied Complexity Group listserv
Meets Fridays 9a-11:30 at cafe at St. John's College
lectures, archives, unsubscribe, maps at http://www.friam.org
Marko A. Rodriguez
Los Alamos National Laboratory (P362-proto)
Los Alamos, NM 87545
Phone +1 505 606 1691
http://www.soe.ucsc.edu/~okram