XML parsing (loads of data question)...
I have a large amount of XML data (say 30-100MB) that I need to parse. My first thought was to load it into an array at start-up, then get the keys and loop through each key (using repeat for each line), testing for a match and returning the value if there is one. However, I figure that this would nearly double the amount of memory I need, since the full index of keys would be almost as large as the entire data set. As there is no syntax for referring to elements in an associative array by numerical index (i.e. get the first, second, etc.), what would be the fastest and most memory-efficient technique? Should I use an external database like "Pandora", or whatever the name beginning with P that I am searching for is? Maybe I need my own relational database :-)

Archives: http://www.mail-archive.com/metacard%40lists.best.com/
Info: http://www.xworlds.com/metacard/mailinglist.htm
Please send bug reports to [EMAIL PROTECTED], not this list.
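One memory-conscious alternative to loading the whole document into an array is a streaming parse, which inspects one record at a time and discards it before moving on. The sketch below uses Python's standard library as a stand-in for MetaCard scripting. The element name `record` and the attribute `key` are hypothetical, since the thread never shows the actual XML structure.

```python
import io
import xml.etree.ElementTree as ET


def find_value(source, record_tag, key_attr, wanted_key):
    """Stream through `source` and return the text of the first matching record.

    `source` is a filename or a binary file object. `record_tag` and
    `key_attr` are assumptions about how the data is laid out.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == record_tag:
            if elem.get(key_attr) == wanted_key:
                return elem.text
            # Drop the text, attributes and children of a record we have
            # already inspected so they can be garbage-collected.
            elem.clear()
    return None


if __name__ == "__main__":
    sample = io.BytesIO(
        b"<records>"
        b"<record key='alpha'>first</record>"
        b"<record key='beta'>second</record>"
        b"</records>"
    )
    print(find_value(sample, "record", "key", "beta"))
```

Each lookup still scans the file from the start, so this trades speed for memory. The cleared elements also stay attached to the root as empty shells, so memory use is much reduced rather than constant.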
Re: XML parsing (loads of data question)...
I'm looking towards Valentina (http://www.paradigmasoft.com) for a relational database solution for MC. However, you need to get the data from the format it's in to a format suitable for Valentina import.

A solution I've used in the past is to index the records into a file structure depending on the key, or keys. A converter would be written to split the data into directories and subdirectories so that smaller files exist in defined locations. E.g. to search on a surname "Davis", MC could look into C:/d/a/v/info.dat, which should contain a relatively small file with names such as Davis, Davies, Davison, Davidson etc., and is therefore much easier and quicker to handle.

Other issues are writing to this file structure 'database' and/or how often you need to convert the original/updated data.

Gary Rathbone

on 9/21/00 8:48 AM, David Bovill at [EMAIL PROTECTED] wrote:
> If I have large amounts of XML data (say 30-100MB), which I need to parse. --snip--
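The directory-per-letter index Gary describes could be sketched roughly as follows, in Python rather than MetaCard script. The three-letter depth and the `info.dat` file name come from his `C:/d/a/v/info.dat` example. The tab-delimited record layout and the function names `shard_path`, `add_record` and `lookup` are assumptions made for illustration.

```python
import os
import tempfile


def shard_path(root, key, depth=3):
    """Return the file holding records for `key`, e.g. root/d/a/v/info.dat."""
    letters = [c for c in key.lower() if c.isalnum()][:depth]
    return os.path.join(root, *letters, "info.dat")


def add_record(root, key, value):
    """Append one record to the shard for `key` (tab-delimited: an assumption)."""
    path = shard_path(root, key)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "a", encoding="utf-8") as f:
        f.write(f"{key}\t{value}\n")


def lookup(root, key):
    """Read only the small shard for `key` and return the matching values."""
    path = shard_path(root, key)
    if not os.path.exists(path):
        return []
    matches = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            k, _, v = line.rstrip("\n").partition("\t")
            if k == key:
                matches.append(v)
    return matches


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    for name in ("Davis", "Davies", "Davison", "Smith"):
        add_record(root, name, f"record for {name}")
    print(lookup(root, "Davis"))
```

A lookup for "Davis" opens only the `d/a/v` shard, which also holds Davies and Davison, so the amount read stays small however large the full data set is. Keys or values containing tabs or newlines would need escaping, which this sketch does not attempt.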
Re: XML parsing (loads of data question)...
Thanks Gary...

From: Gary Rathbone [EMAIL PROTECTED]
Reply-To: [EMAIL PROTECTED]
Date: Thu, 21 Sep 2000 14:23:55 +0100
To: [EMAIL PROTECTED]
Subject: Re: XML parsing (loads of data question)...

> I'm looking towards Valentina (http://www.paradigmasoft.com) for a relational database solution for MC.

What implications does working with Valentina have for memory use? I ask because one of the reasons for converting the project from Java is to get it working on Macs with under 50MB of RAM.

> However you need to get the data from the format it's in, to the format suitable for Valentina import.

What sort of format is required? Tab delimited?

> A solution I've used in the past is to index the records into a file structure depending on the key, or keys. --snip--

Thanks for the tip.
Re: XML parsing (loads of data question)...
Can't help too much on the Valentina specifics as I'm evaluating it myself. I suggest you take a look at the site, http://www.paradigmasoft.com, and ask the guys there (they've been very helpful).

I know Scott endorses it ...

Post dated 3/9/2000

--snip--
Anyone building (or contemplating building) applications that require managing more than a few MB of data or that require sophisticated query support should download [valentina]
--snip--
--snip--
This is a key technology for MetaCard developers to have in their arsenal and we need to do what it takes to make sure that it works well.
Regards, Scott [Raney]
--snip--

on 21/9/00 15:20, David Bovill at [EMAIL PROTECTED] wrote:
> Thanks Gary... --snip--