Hi Julie,
Yes, for maximum performance, you should probably have a complete CSV format.

To maximize performance, I would suggest skipping the xmlDoc. Instead, you 
will need to create a new class called metro_csv2metro. That class would 
replace metro_dom2metro (metro_dom2metro.py). That way, you would skip 
all the XML code.

In the metro_config.py file, the execution sequence would have to be 
changed:

    dConfig['INIT_MODULE_EXECUTION_SEQUENCE'] = \
        {'VALUE'   :["metro_read_forecast",           # read the file and put it in a string
                     "metro_validate_forecast",       # validate the well-formedness of the XML file (optional)
                     "metro_string2dom_forecast",     # create an XML DOM from a string
                     "metro_read_observation",        # read the file and put it in a string
                     "metro_validate_observation",    # validate the well-formedness of the XML file (optional)
                     "metro_string2dom_observation",  # create an XML DOM from a string
                     "metro_read_station",            # read the file and put it in a string
                     "metro_validate_station",        # validate the well-formedness of the XML file (optional)
                     "metro_string2dom_station",      # create an XML DOM from a string
                     "metro_dom2metro",               # convert all XML DOMs to the metro data structure
                     ...

would become:

    dConfig['INIT_MODULE_EXECUTION_SEQUENCE'] = \
        {'VALUE'   :["metro_read_forecast",           # read the file and put it in a string
                     "metro_validate_forecast",       # validate the well-formedness of the CSV file (optional)
                     "metro_string2csv_forecast",     # convert the string to an intermediate format
                     "metro_read_observation",        # read the file and put it in a string
                     "metro_validate_observation",    # validate the well-formedness of the CSV file (optional)
                     "metro_string2csv_observation",  # convert the string to an intermediate format
                     "metro_read_station",            # read the file and put it in a string
                     "metro_validate_station",        # validate the well-formedness of the CSV file (optional)
                     "metro_string2csv_station",      # convert the string to an intermediate format
                     "metro_csv2metro",               # convert the intermediate format to the metro data structure
                     ...
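
As a rough illustration of what one of the "string2csv" steps in this
sequence might do, here is a minimal sketch. It assumes the CSV file has
no header row and one record per line. The function name
string_to_csv_rows is hypothetical and is not part of METRo; the sketch
only uses Python's standard csv module, and the sample values are
illustrative.

```python
import csv
import io


def string_to_csv_rows(csv_text):
    """Split raw CSV text into a list of rows (each row a list of strings).

    Hypothetical helper, not part of METRo. Blank lines are skipped.
    """
    reader = csv.reader(io.StringIO(csv_text))
    return [row for row in reader if row]


# Illustrative data only: one forecast record per line, no header row.
sample = (
    "2004-01-30T12:00Z,22,984.00,-11.00,-14.00,0,0.00,0.00\n"
    "2004-01-30T13:00Z,20,988.00,-10.00,-13.00,7,0.00,0.00\n"
)
rows = string_to_csv_rows(sample)
```

Every field is still a string at this point; converting the fields to
their final types would be left to the metro_csv2metro step.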

That is the execution sequence I would use. I would also try to reuse the 
metro_config file as much as possible to read your CSV file. For example, 
I would order the CSV forecast data in the order provided by:

--------------------------------------------------------------------------------------------------
    dConfig['XML_FORECAST_PREDICTION_STANDARD_ITEMS'] = \
        {'VALUE' :[{'NAME':"FORECAST_TIME",
                    'XML_TAG':"forecast-time",
                    'DATA_TYPE':"DATE"},

                   {'NAME':"WS",
                    'XML_TAG':"ws",
                    'DATA_TYPE':"REAL"},

                   {'NAME':"AP",
                    'XML_TAG':"ap",
                    'DATA_TYPE':"REAL"},

                   {'NAME':"AT",
                    'XML_TAG':"at",
                    'DATA_TYPE':"REAL"},

                   {'NAME':"TD",
                    'XML_TAG':"td",
                    'DATA_TYPE':"REAL"},

                   {'NAME':"CC",
                    'XML_TAG':"cc",
                    'DATA_TYPE':"INTEGER"},

                   {'NAME':"SN",
                    'XML_TAG':"sn",
                    'DATA_TYPE':"REAL"},

                   {'NAME':"RA",
                    'XML_TAG':"ra",
                    'DATA_TYPE':"REAL"},
                   ],
         'FROM'     :CFG_INTERNAL,
         'COMMENTS' :_("standard forecast prediction items")}
----------------------------------------------------------------------------------
I would also use the 'DATA_TYPE' to convert each string to the right type.
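
A minimal sketch of that idea, assuming each CSV line carries its fields
in the same order as the item list above and that dates look like
2004-01-30T12:00Z. The names FORECAST_ITEMS, CONVERTERS, parse_date and
convert_row are hypothetical and not part of METRo; the item list is a
trimmed copy that keeps only 'NAME' and 'DATA_TYPE'.

```python
from datetime import datetime, timezone

# Trimmed copy of the forecast item list (NAME and DATA_TYPE only).
FORECAST_ITEMS = [
    {'NAME': "FORECAST_TIME", 'DATA_TYPE': "DATE"},
    {'NAME': "WS", 'DATA_TYPE': "REAL"},
    {'NAME': "AP", 'DATA_TYPE': "REAL"},
    {'NAME': "AT", 'DATA_TYPE': "REAL"},
    {'NAME': "TD", 'DATA_TYPE': "REAL"},
    {'NAME': "CC", 'DATA_TYPE': "INTEGER"},
    {'NAME': "SN", 'DATA_TYPE': "REAL"},
    {'NAME': "RA", 'DATA_TYPE': "REAL"},
]


def parse_date(text):
    # Assumption: timestamps are UTC and formatted like 2004-01-30T12:00Z.
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%MZ")
    return parsed.replace(tzinfo=timezone.utc)


# One converter per DATA_TYPE value used in the item list.
CONVERTERS = {"DATE": parse_date, "REAL": float, "INTEGER": int}


def convert_row(fields, items=FORECAST_ITEMS):
    """Turn a list of CSV strings into a dict of typed values keyed by NAME."""
    if len(fields) != len(items):
        raise ValueError("expected %d fields, got %d" % (len(items), len(fields)))
    return {
        item['NAME']: CONVERTERS[item['DATA_TYPE']](field.strip())
        for item, field in zip(items, fields)
    }


# Illustrative values only.
line = "2004-01-30T12:00Z,22,984.00,-11.00,-14.00,0,0.00,0.00"
record = convert_row(line.split(","))
```

Driving the conversion from the item list means that adding or
reordering a column only requires a change in the configuration, not in
the parsing code.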

I hope this helps. If you have any questions, feel free to contact me.
François


Julie Prestopnik wrote:
> Hi François.  Regarding the mixture of XML and CSV, I'm guessing we
> probably wouldn't get much of an increase in performance that way, but
> I'm not sure.  I say that only because if all of the XML code still has
> to execute, it seems that it would take a similar amount of time as it
> does now.  I think the CSV format we had in mind would be similar to
> what you have below, except without the XML.
>
> Digging deeper in the code, it looks like the data is stored in an
> xmlDoc object.  I'm thinking we would need to create an xmlDoc object
> with our CSV data, but I'm not sure about that either.  From what you
> know of the code, does that seem accurate?  Do you have any
> input/suggestions about that?
>
> Thanks for your help,
> Julie
>
> Francois Fortin wrote:
>   
>> Hi Julie,
>> I thought I could easily add a CSV parser to my code. The format would
>> have been a mixture of XML and CSV. Here is an example:
>>
>> <?xml version="1.0"?>
>> <forecast>
>>    <header>
>>        <version>1.1</version>
>>        <production-date>2004-01-30T12:00Z</production-date>
>>    </header>
>>    <prediction-list>
>>       <prediction>2004-01-30T12:00Z,22,1,-11.00,-14.00,0,0.00,0.00,984.00</prediction>
>>       <prediction>2004-01-30T13:00Z,20,1,-10.00,-13.00,7,0.00,0.00,988.00</prediction>
>>       <prediction>2004-01-30T14:00Z,20,1,-10.00,-13.00,7,0.00,0.00,988.00</prediction>
>>       <prediction>2004-01-30T15:00Z,20,1,-9.00,-13.00,7,0.00,0.00,988.00</prediction>
>>       <prediction>2004-01-30T16:00Z,20,1,-8.00,-13.00,7,0.00,0.00,989.00</prediction>
>>       <prediction>2004-01-30T17:00Z,20,1,-7.00,-12.00,7,0.00,0.00,989.00</prediction>
>>    </prediction-list>
>> </forecast>
>>
>> Unfortunately this requires a bigger effort than I thought. Also, I'm not
>> sure you would gain enough performance from that modification. Do you
>> think it would be OK?
>>
>> Do you have an idea what your CSV format would be?
>>
>>
>>
>>
>> Julie Prestopnik wrote:
>>     
>>> Thanks, François!  I'll look forward to getting your suggestions.
>>>
>>> Julie
>>>
>>> Francois Fortin wrote:
>>>  
>>>       
>>>> Hi Julie,
>>>>
>>>> From now on, let's reply to [EMAIL PROTECTED] This way our
>>>> discussion will be broadcast to the other developers.
>>>>
>>>> I would prefer method 1 or 2. If you prefer method 3, it's not a problem,
>>>> but maybe we can also add method 2 to force a particular file type.
>>>>
>>>> We would like to integrate it in METRo. It would be a great feature to
>>>> have.
>>>>
>>>> I will look at the source code a bit before sending you my suggestion. I
>>>> will try to do that by the end of the week.
>>>>
>>>> Thanks
>>>> François
>>>>
>>>> Julie Prestopnik wrote:
>>>>    
>>>>         
>>>>> Hi François.  I'm working on the design right now.  My plan was to
>>>>> use the python debugger to step through the METRo modules from the
>>>>> very beginning of program execution to see what modules would need
>>>>> to be modified/added.
>>>>>
>>>>> My co-workers and I discussed the various options for letting METRo
>>>>> know that it is being given a new type of input/output file and we
>>>>> came up with three possible ways:
>>>>>
>>>>> 1. Changing the command line options.  For example, having
>>>>> "--input-forecast-csv filename" instead of "--input-forecast filename"
>>>>> (Note the addition of the -csv at the end of the option)
>>>>>
>>>>> 2. Adding another option to METRo.  For example, "--input-file-format
>>>>> filetype" and "--output-file-format filetype", with filetype being
>>>>> either "xml" or "csv".
>>>>>
>>>>> 3. Pushing the check further down.  For example, once the filenames are
>>>>> loaded in, check for a .xml or .csv extension and proceed as necessary.
>>>>>
>>>>> I don't think we really have a preference on which one to use.
>>>>> Personally, I'm somewhat fond of the third option, because the
>>>>> interface to METRo would not change. However, it then restricts the
>>>>> user in their choice of input and output filenames.  Do you have a
>>>>> preference?
>>>>>
>>>>> Regarding your question of knowing exactly what to do to allow for the
>>>>> different input format, like I said, I've started using the debugger to
>>>>> step through the code and have been making notes along the way of what
>>>>> code appears to need modification.  Then, I was going to dive in, start
>>>>> making changes, and hope for the best.  ;)
>>>>>
>>>>> Any suggestions from you would certainly be appreciated.  Is this
>>>>> something you would consider integrating into and maintaining in
>>>>> MetSurface if I get it working?
>>>>>
>>>>> Thanks,
>>>>> Julie
>>>>>
>>>>> Francois Fortin wrote:
>>>>>  
>>>>>      
>>>>>           
>>>>>> Hi Julie,
>>>>>> Do you have an idea how to do that? I have one but I will need some
>>>>>> time to look at that.
>>>>>>
>>>>>> Julie Prestopnik wrote:
>>>>>>           
>>>>>>             
>>>>>>> Hello METRo developers.
>>>>>>>
>>>>>>> We (NCAR) are considering adding an option to allow for a new input
>>>>>>> format (CSV) to METRo, since the XML I/O consumes most of the
>>>>>>> processing time.  We'd like to add an option, either by adding to or
>>>>>>> changing the command line interface or by pushing the check further
>>>>>>> down (e.g. checking for file extension: .xml or .csv).
>>>>>>>
>>>>>>> We wanted to run this by everyone to see if you might have any input,
>>>>>>> suggestions, or objections.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Julie
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> METRo-developers mailing list
>>>>>>> [email protected]
>>>>>>> https://mail.gna.org/listinfo/metro-developers
>>>>>>>                   
>>>>>>>               
>>>>>         
>>>>>           
>>>   
>>>       
>
>   

-- 
François Fortin
Programmeur analyste scientifique
(514)421-7245 


