Re: AW: Apache Tuscany doubts

Raymond Feng Fri, 27 Jun 2008 08:38:46 -0700

Thanks Simon for the quick and nice response. Just want to clarify a bit on the data transfer: In Tuscany SCA, the data can be in any format, XML or binary. What matters is the data formats that the communication protocol (binding) can handle. We also have a databinding framework to enable transparent data transformation across formats.


For example, you can model the data transfer service as:


DataTransfer
   byte[] receiveData(...);
or
   InputStream receiveData();
or
   Image receiveData();
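
A minimal Java sketch of the first two shapes above, under stated assumptions: the interface names, the datasetId parameter and the in-memory implementations are hypothetical and only illustrate the idea. SCA annotations, the binding and the databinding framework are deliberately left out, and the Image variant would follow the same pattern with a domain type.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical interfaces: the names and the datasetId parameter are assumptions.
interface BulkDataTransfer {
    byte[] receiveData(String datasetId);
}

interface StreamingDataTransfer {
    InputStream receiveData(String datasetId);
}

// In-memory stand-in for a real data source.
final class InMemoryBulkTransfer implements BulkDataTransfer {
    @Override
    public byte[] receiveData(String datasetId) {
        return ("payload-for-" + datasetId).getBytes(StandardCharsets.UTF_8);
    }
}

// Exposes the same payload as a stream, so a caller never needs the whole array.
final class InMemoryStreamingTransfer implements StreamingDataTransfer {
    private final BulkDataTransfer source = new InMemoryBulkTransfer();

    @Override
    public InputStream receiveData(String datasetId) {
        return new ByteArrayInputStream(source.receiveData(datasetId));
    }
}

final class DataTransferSketch {
    // Drains the stream in fixed-size chunks and returns the number of bytes read.
    static long countBytes(InputStream in) {
        byte[] buffer = new byte[8192];
        long total = 0;
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        byte[] bulk = new InMemoryBulkTransfer().receiveData("reading1");
        long streamed = countBytes(new InMemoryStreamingTransfer().receiveData("reading1"));
        System.out.println(bulk.length + " bytes in memory, " + streamed + " bytes streamed");
    }
}
```

The byte[] form is the simplest but holds the whole payload in memory; the InputStream form lets the consumer read in chunks, which is probably the more realistic shape for large datasets.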

Thanks,
Raymond

--------------------------------------------------
From: "Simon Laws" <[EMAIL PROTECTED]>
Sent: Friday, June 27, 2008 6:18 AM
To: <[email protected]>
Subject: Re: AW: Apache Tuscany doubts

On Fri, Jun 27, 2008 at 1:56 AM, Malte Marquarding <[EMAIL PROTECTED]> wrote:
Hi Raymond,

The system we envisage is roughly as follows.

We are running a set of radio telescopes at a remote site. The high-level
exposure of the various parts of the system should be via service components.
The actual implementation is specific to the problem. We have the telescope
control (of the hardware). The data generated from these telescopes has to
be processed on a supercomputer (hence implementation in C++), through a
set of services like Calibration, Imaging, Analysis. It also has some
control queue (running the end-to-end process, referencing the various
services), archiving, logging, user access (virtual observatory) components,
etc.

I am investigating Tuscany for this.
Another unrelated problem I have is that I can't see (with SDOs and SCA) how
to handle the transfer of the data. The data output of the C++ services
is tens to hundreds of terabytes. I was thinking of having a DataMoving
service encapsulating something like GridFTP. Has anyone got a suggestion
of how to handle this in an SCA context?

I can give more details if necessary.

Cheers,
Malte.
On Fri, Jun 27, 2008 at 4:16 AM, Raymond Feng <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Can you describe the use cases you have in mind? It will help us better
> understand what you want to achieve.
>
>
Hi Malte

The bindings we have implemented to date are intended to operate in the
typical SOA environment where you pass data to a component and ask it to do
something. We tend to talk in terms of XML documents, which will be fine for
the control messages you need but not suitable for the telescope data
itself.
To try and understand the subtleties of your scenario I'm going to take the
components you suggested and invent some operations that we might expect to
find there...

TelescopeControl
  PerformObservation(ObservationParameters, ObservationId)
    // I assume you give the telescope a job to do, i.e. point at the sky and
    // record the results against a given ID
    // and then callback when the task is complete
DataTransfer
  Transfer(FromLocation, ToLocation)
    // just manages the task of moving large datasets across the network. As
    // you say, GridFTP could be a candidate here.
Calibration
  Run(ObservationId)
  GetDatasetLocation(ID)
Imaging
  Run(ObservationId)
  GetDatasetLocation(ID)
Analysis
  Run(ObservationId)
  GetDatasetLocation(ID)
Archive
  GetDatasetLocation(ID)
Logging
  // are you logging control messages here?
Coordination
  DoSomething()
    // coordinate the activities of the application, for example, it might do
    TelescopeControl.PerformObservation(someParams, "reading1")
    // when task is complete
    fromLocation = TelescopeControl.GetDatasetLocation("reading1")
    toLocation = Calibration.GetDatasetLocation("reading1")
    DataTransfer.Transfer(fromLocation, toLocation)
    Calibration.Run("reading1")

etc.
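
The coordination steps above could be sketched in plain Java roughly as follows. This is only an illustration under assumptions: the Java method names mirror the invented operations, getDatasetLocation on TelescopeControl is implied by the coordination pseudocode rather than listed under that component, the location strings are made up, and the completion callback is replaced by a simple synchronous call. In an SCA composite the three collaborators would be wired in as references rather than constructed by hand.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical service interfaces mirroring the invented operations.
interface TelescopeControl {
    void performObservation(String parameters, String observationId);
    String getDatasetLocation(String observationId);
}

interface DataTransfer {
    void transfer(String fromLocation, String toLocation);
}

interface Calibration {
    void run(String observationId);
    String getDatasetLocation(String observationId);
}

// Holds only the control flow; the bulk data never passes through it.
final class Coordination {
    private final TelescopeControl telescope;
    private final DataTransfer dataTransfer;
    private final Calibration calibration;

    Coordination(TelescopeControl telescope, DataTransfer dataTransfer, Calibration calibration) {
        this.telescope = telescope;
        this.dataTransfer = dataTransfer;
        this.calibration = calibration;
    }

    void observeAndCalibrate(String parameters, String observationId) {
        // Synchronous here; the original description uses a callback on completion.
        telescope.performObservation(parameters, observationId);
        String from = telescope.getDatasetLocation(observationId);
        String to = calibration.getDatasetLocation(observationId);
        dataTransfer.transfer(from, to);
        calibration.run(observationId);
    }
}

final class CoordinationSketch {
    // Runs the flow against recording stubs and returns the recorded steps.
    static List<String> runDemo(String observationId) {
        List<String> log = new ArrayList<>();
        TelescopeControl telescope = new TelescopeControl() {
            @Override public void performObservation(String parameters, String id) {
                log.add("observe " + id);
            }
            @Override public String getDatasetLocation(String id) {
                return "telescope://" + id; // made-up location format
            }
        };
        Calibration calibration = new Calibration() {
            @Override public void run(String id) {
                log.add("calibrate " + id);
            }
            @Override public String getDatasetLocation(String id) {
                return "calibration://" + id; // made-up location format
            }
        };
        DataTransfer transfer = (from, to) -> log.add("transfer " + from + " -> " + to);
        new Coordination(telescope, transfer, calibration).observeAndCalibrate("someParams", observationId);
        return log;
    }

    public static void main(String[] args) {
        runDemo("reading1").forEach(System.out::println);
    }
}
```

Only locations and IDs cross the service interfaces here, which is the point of the design: the control messages stay small while the datasets move out of band.
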
The Calibration, Analysis, Imaging components are a bit tricky to visualize.
Are they closely related or stand alone? Do they always have to run in
sequence in the same order? What sort of infrastructure do they rely on? For
example, you mention a supercomputer, so are we talking MPI collectives and
Condor-like schedulers? In which case an SCA component such as "Calibrate"
may just wrap the task of creating and submitting JSDL to the scheduler
rather than representing the Calibration code itself. You may even resort
to a more generic "ComputeEngine" component that allows you to dynamically
configure jobs to be run.
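
As a rough illustration of that wrapping idea, here is a small Java sketch. Everything in it is hypothetical: JobScheduler stands in for whatever scheduler client is actually available, and the job description is a placeholder string, not valid JSDL.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical facade over a batch scheduler; not a real scheduler API.
interface JobScheduler {
    // Submits a job description and returns a scheduler-assigned job ID.
    String submit(String jobDescription);
}

// Generic component: the task to run is configuration, not code.
final class ComputeEngine {
    private final JobScheduler scheduler;

    ComputeEngine(JobScheduler scheduler) {
        this.scheduler = scheduler;
    }

    String run(String taskName, String observationId) {
        // Placeholder description; a real component would build a JSDL document here.
        String description = "task=" + taskName + ";observation=" + observationId;
        return scheduler.submit(description);
    }
}

// "Calibrate" only creates and submits the job; the calibration code runs elsewhere.
final class Calibrate {
    private final ComputeEngine engine;

    Calibrate(ComputeEngine engine) {
        this.engine = engine;
    }

    String run(String observationId) {
        return engine.run("calibration", observationId);
    }
}

final class SchedulerSketch {
    public static void main(String[] args) {
        List<String> submitted = new ArrayList<>();
        // Recording stub in place of a real scheduler.
        JobScheduler scheduler = description -> {
            submitted.add(description);
            return "job-" + submitted.size();
        };
        String jobId = new Calibrate(new ComputeEngine(scheduler)).run("reading1");
        System.out.println(jobId + " <- " + submitted.get(0));
    }
}
```

With this split, Imaging and Analysis could reuse the same ComputeEngine with a different task name instead of each wrapping the scheduler separately.
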
Personally I would like Tuscany to be able to slot right in here so that the
analytical components could be supported in HPC environments with
appropriate SCA implementation types, bindings and integration with the
underlying HPC and grid infrastructure. We are not really there yet. I've
done some work on a LoadBalancer demo that shows Tuscany running in a Tomcat
cluster, but scenarios like yours can really help us all think about what
features would be appropriate.

Regards

Simon
