Hi François. Thanks for your help. I'm looking into this more now.
Can you please explain what the code in data_module/metro_infdata.py and
data_module/metro_infdata_container.py does?
Also, in your notes it says that metro_string2dom (forecast, observation
and station) "creates a XMLDOM from a string", and you indicate that we
would need to create a metro_string2csv set of files to convert the
string into an intermediate format. Can you describe in more detail
what an XMLDOM looks like and contains?
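
For reference, here is a minimal sketch of what I understand a DOM to
be, using Python's standard xml.dom.minidom. This is only an
illustration: the metro_string2dom modules may use a different XML
library, the sample string is a hypothetical, cut-down forecast, and
first_prediction is a made-up helper name.

```python
from xml.dom import minidom

# Hypothetical, cut-down forecast string. The tag names follow the XML_TAG
# entries in metro_config.py, but the exact layout is an assumption.
FORECAST_XML = """<?xml version="1.0"?>
<forecast>
  <prediction-list>
    <prediction>
      <forecast-time>2004-01-30T12:00Z</forecast-time>
      <at>-11.00</at>
    </prediction>
  </prediction-list>
</forecast>"""


def first_prediction(xml_string):
    """Parse the string into a DOM tree and return the first prediction as a dict."""
    dom = minidom.parseString(xml_string)
    prediction = dom.getElementsByTagName("prediction")[0]
    values = {}
    for child in prediction.childNodes:
        # Skip the whitespace text nodes between elements.
        if child.nodeType == child.ELEMENT_NODE:
            values[child.tagName] = child.firstChild.data
    return values


result = first_prediction(FORECAST_XML)
print(result)
```

In other words, the DOM is a tree of node objects (elements, text) built
from the string, and every value still has to be pulled out of it and
converted afterwards.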
Thanks for your help in advance.
Julie
Francois Fortin wrote:
> Hi Julie,
> Yes, for maximum performance you should probably have a complete CSV
> format.
>
> To maximize performance, I would suggest skipping the xmlDoc. Instead,
> you will need to create a new class called metro_csv2metro. That class
> would replace metro_dom2metro (metro_dom2metro.py). That way, you would
> skip all the XML code.
>
> In the metro_config.py file, the execution sequence would have to be
> changed:
>
> dConfig['INIT_MODULE_EXECUTION_SEQUENCE'] = \
>     {'VALUE': [
>         "metro_read_forecast",           # read the file and put it in a string
>         "metro_validate_forecast",       # validate the well-formedness of the XML files (optional)
>         "metro_string2dom_forecast",     # create a XMLDOM from a string
>         "metro_read_observation",        # read the file and put it in a string
>         "metro_validate_observation",    # validate the well-formedness of the XML files (optional)
>         "metro_string2dom_observation",  # create a XMLDOM from a string
>         "metro_read_station",            # read the file and put it in a string
>         "metro_validate_station",        # validate the well-formedness of the XML files (optional)
>         "metro_string2dom_station",      # create a XMLDOM from a string
>         "metro_dom2metro",               # convert all XMLDOM to the metro data structure
>         ...
>
> would become:
>
> dConfig['INIT_MODULE_EXECUTION_SEQUENCE'] = \
>     {'VALUE': [
>         "metro_read_forecast",           # read the file and put it in a string
>         "metro_validate_forecast",       # validate the well-formedness of the CSV files (optional)
>         "metro_string2csv_forecast",     # convert string to intermediate format
>         "metro_read_observation",        # read the file and put it in a string
>         "metro_validate_observation",    # validate the well-formedness of the CSV files (optional)
>         "metro_string2csv_observation",  # convert string to intermediate format
>         "metro_read_station",            # read the file and put it in a string
>         "metro_validate_station",        # validate the well-formedness of the CSV files (optional)
>         "metro_string2csv_station",      # convert string to intermediate format
>         "metro_csv2metro",               # convert intermediate format to the metro data structure
>         ...
>
> I would do it that way. I would also try to reuse the metro_config file
> as much as possible to read your CSV file. For example, I would put the
> columns of my forecast CSV data in the order given by:
>
> --------------------------------------------------------------------------------------------------
>
> dConfig['XML_FORECAST_PREDICTION_STANDARD_ITEMS'] = \
> {'VALUE' :[{'NAME':"FORECAST_TIME",
> 'XML_TAG':"forecast-time",
> 'DATA_TYPE':"DATE"},
>
> {'NAME':"WS",
> 'XML_TAG':"ws",
> 'DATA_TYPE':"REAL"},
>
> {'NAME':"AP",
> 'XML_TAG':"ap",
> 'DATA_TYPE':"REAL"},
>
> {'NAME':"AT",
> 'XML_TAG':"at",
> 'DATA_TYPE':"REAL"},
>
> {'NAME':"TD",
> 'XML_TAG':"td",
> 'DATA_TYPE':"REAL"},
>
> {'NAME':"CC",
> 'XML_TAG':"cc",
> 'DATA_TYPE':"INTEGER"},
>
> {'NAME':"SN",
> 'XML_TAG':"sn",
> 'DATA_TYPE':"REAL"},
>
> {'NAME':"RA",
> 'XML_TAG':"ra",
> 'DATA_TYPE':"REAL"},
> ],
> 'FROM' :CFG_INTERNAL,
> 'COMMENTS' :_("standard forecast prediction items")}
> ----------------------------------------------------------------------------------
>
> I would also use the 'DATA_TYPE' to convert each string to the right type.
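
A minimal sketch of that idea, assuming the CSV columns follow the order
of the standard items listed in the config entry. ITEMS is a trimmed
copy of that entry (XML_TAG left out); CONVERTERS, convert_row and the
sample row are hypothetical and not part of METRo. DATE values are kept
as strings here because the thread does not show how METRo represents
dates internally.

```python
import csv
import io

# Trimmed copy of XML_FORECAST_PREDICTION_STANDARD_ITEMS (XML_TAG omitted).
ITEMS = [
    {'NAME': "FORECAST_TIME", 'DATA_TYPE': "DATE"},
    {'NAME': "WS", 'DATA_TYPE': "REAL"},
    {'NAME': "AP", 'DATA_TYPE': "REAL"},
    {'NAME': "AT", 'DATA_TYPE': "REAL"},
    {'NAME': "TD", 'DATA_TYPE': "REAL"},
    {'NAME': "CC", 'DATA_TYPE': "INTEGER"},
    {'NAME': "SN", 'DATA_TYPE': "REAL"},
    {'NAME': "RA", 'DATA_TYPE': "REAL"},
]

# Hypothetical mapping from DATA_TYPE to a Python converter.
# DATE stays a string: the internal date representation is an assumption
# this sketch deliberately avoids making.
CONVERTERS = {"DATE": str, "REAL": float, "INTEGER": int}


def convert_row(fields, items=ITEMS):
    """Convert one CSV row (list of strings) into a dict keyed by item NAME."""
    if len(fields) != len(items):
        raise ValueError("expected %d columns, got %d" % (len(items), len(fields)))
    return {
        item['NAME']: CONVERTERS[item['DATA_TYPE']](value.strip())
        for item, value in zip(items, fields)
    }


# Made-up sample row, in the column order defined by ITEMS.
SAMPLE_CSV = "2004-01-30T12:00Z,22,984.00,-11.00,-14.00,0,0.00,0.00\n"

rows = [convert_row(fields) for fields in csv.reader(io.StringIO(SAMPLE_CSV))]
print(rows[0])
```

Driving both the column order and the conversion from one list means a
new item only has to be added in the config, not in the parser.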
>
> I hope this helps. If you have any questions, feel free to contact me.
> François
>
>
> Julie Prestopnik wrote:
>> Hi François. Regarding the mixture of XML and CSV, I'm guessing we
>> probably wouldn't get much of an increase in performance that way, but
>> I'm not sure. I say that only because if all of the XML code still has
>> to execute, it seems that it would take a similar amount of time as it
>> does now. I think the CSV format we had in mind would be similar to
>> what you have below, except without the XML.
>>
>> Digging deeper in the code, it looks like the data is stored in an
>> xmlDoc object. I'm thinking we would need to create an xmlDoc object
>> with our CSV data, but I'm not sure about that either. From what you
>> know of the code, does that seem accurate? Do you have any
>> input/suggestions about that?
>>
>> Thanks for your help,
>> Julie
>>
>> Francois Fortin wrote:
>>
>>> Hi Julie,
>>> I thought I could easily add a CSV parser to my code. The format would
>>> have been a mixture of XML and CSV. Here is an example:
>>>
>>> <?xml version="1.0"?>
>>> <forecast>
>>>   <header>
>>>     <version>1.1</version>
>>>     <production-date>2004-01-30T12:00Z</production-date>
>>>   </header>
>>>   <prediction-list>
>>>     <prediction>2004-01-30T12:00Z,22,1,-11.00,-14.00,0,0.00,0.00,984.00</prediction>
>>>     <prediction>2004-01-30T13:00Z,20,1,-10.00,-13.00,7,0.00,0.00,988.00</prediction>
>>>     <prediction>2004-01-30T14:00Z,20,1,-10.00,-13.00,7,0.00,0.00,988.00</prediction>
>>>     <prediction>2004-01-30T15:00Z,20,1,-9.00,-13.00,7,0.00,0.00,988.00</prediction>
>>>     <prediction>2004-01-30T16:00Z,20,1,-8.00,-13.00,7,0.00,0.00,989.00</prediction>
>>>     <prediction>2004-01-30T17:00Z,20,1,-7.00,-12.00,7,0.00,0.00,989.00</prediction>
>>>   </prediction-list>
>>> </forecast>
>>>
>>> Unfortunately, this requires a bigger effort than I thought. Also, I'm
>>> not sure you would gain enough performance from that modification. Do
>>> you think it would be OK?
>>>
>>> Do you have an idea what your CSV format would be?
>>>
>>>
>>>
>>>
>>> Julie Prestopnik wrote:
>>>
>>>> Thanks, François! I'll look forward to getting your suggestions.
>>>>
>>>> Julie
>>>>
>>>> Francois Fortin wrote:
>>>>
>>>>
>>>>> Hi Julie,
>>>>>
>>>>> From now on, let's reply to [EMAIL PROTECTED] This way our
>>>>> discussion will be broadcast to the other developers.
>>>>>
>>>>> I would prefer method 1 or 2. If you prefer method 3, it's not a
>>>>> problem, but maybe we can also add method 2 to force a particular
>>>>> file type.
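
A minimal sketch of how methods 2 and 3 could be combined: guess the
format from the file extension, but let an explicit format option
override it. The function name resolve_file_format and the
SUPPORTED_FORMATS tuple are hypothetical; nothing here reflects METRo's
actual command line handling.

```python
import os

# The two formats discussed in this thread.
SUPPORTED_FORMATS = ("xml", "csv")


def resolve_file_format(filename, forced_format=None):
    """Return "xml" or "csv" for a file.

    forced_format stands in for a value given with an explicit format
    option (method 2). When it is None, fall back to the file extension
    (method 3).
    """
    if forced_format is not None:
        file_format = forced_format.lower()
    else:
        extension = os.path.splitext(filename)[1]
        file_format = extension.lstrip(".").lower()
    if file_format not in SUPPORTED_FORMATS:
        raise ValueError(
            "cannot determine format of %r (got %r)" % (filename, file_format)
        )
    return file_format


print(resolve_file_format("forecast.csv"))
print(resolve_file_format("forecast.dat", forced_format="xml"))
```

With this arrangement the existing interface keeps working for files
named .xml or .csv, and users with other filenames can still force the
type.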
>>>>>
>>>>> We would like to integrate it in METRo. It would be a great feature to
>>>>> have.
>>>>>
>>>>> I will look at the source code a bit before sending you my
>>>>> suggestion. I
>>>>> will try to do that by the end of the week.
>>>>>
>>>>> Thanks
>>>>> François
>>>>>
>>>>> Julie Prestopnik wrote:
>>>>>
>>>>>> Hi François. I'm working on the design right now. My plan was to
>>>>>> use the python debugger to step through the METRo modules from the
>>>>>> very beginning of program execution to see what modules would need
>>>>>> to be modified/added.
>>>>>>
>>>>>> My co-workers and I discussed the various options for letting METRo
>>>>>> know
>>>>>> that it is being given a new type of input/output file and we came up
>>>>>> with three possible ways:
>>>>>>
>>>>>> 1. Changing the command line options. For example, having
>>>>>> "--input-forecast-csv filename" instead of "--input-forecast
>>>>>> filename"
>>>>>> (Note the addition of the -csv at the end of the option)
>>>>>>
>>>>>> 2. Adding another option to METRo. For example, "--input-file-format
>>>>>> filetype" and "--output-file-format filetype", with filetype being
>>>>>> either "xml" or "csv".
>>>>>>
>>>>>> 3. Pushing the check further down. For example, once the
>>>>>> filenames are
>>>>>> loaded in, check for a .xml or .csv extension and proceed as
>>>>>> necessary.
>>>>>>
>>>>>> I don't think we really have a preference on which one to use.
>>>>>> Personally, I'm somewhat fond of the third option, because the
>>>>>> interface to METRo would not change. However, it restricts the
>>>>>> user in their choice of input and output filenames. Do you have a
>>>>>> preference?
>>>>>>
>>>>>> Regarding your question of knowing exactly what to do to allow for
>>>>>> the
>>>>>> different input format, like I said, I've started using the
>>>>>> debugger to
>>>>>> step through the code and have been making notes along the way of
>>>>>> what
>>>>>> code appears to need modification. Then, I was going to dive in,
>>>>>> start
>>>>>> making changes, and hope for the best. ;)
>>>>>>
>>>>>> Any suggestions from you would certainly be appreciated. Is this
>>>>>> something you would consider integrating into and maintaining in
>>>>>> MetSurface if I get it working?
>>>>>>
>>>>>> Thanks,
>>>>>> Julie
>>>>>>
>>>>>> Francois Fortin wrote:
>>>>>>
>>>>>>
>>>>>>> Hi Julie,
>>>>>>> Do you have an idea how to do that? I have one but I will need some
>>>>>>> time
>>>>>>> to look at that.
>>>>>>>
>>>>>>> Julie Prestopnik wrote:
>>>>>>>
>>>>>>>> Hello METRo developers.
>>>>>>>>
>>>>>>>> We (NCAR) are considering adding an option to allow for a new input
>>>>>>>> format (CSV) to METRo, since the XML I/O consumes most of the
>>>>>>>> processing
>>>>>>>> time. We'd like to add an option, either by adding to or changing
>>>>>>>> the
>>>>>>>> command line interface or by pushing the check further down (e.g.
>>>>>>>> checking for file extension: .xml or .csv).
>>>>>>>>
>>>>>>>> We wanted to run this by everyone to see if you might have any
>>>>>>>> input,
>>>>>>>> suggestions, or objections.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Julie
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> METRo-developers mailing list
>>>>>>>> [email protected]
>>>>>>>> https://mail.gna.org/listinfo/metro-developers
>>>>>>>>