Ok. So if I understand correctly, I need:
PC1 => HMaster (HBase), JobTracker (Hadoop), Name Node (Hadoop), and ZooKeeper (ZK)
PC2 => Secondary Name Node (Hadoop)
PC3 to x => Data Node (Hadoop), Task Tracker (Hadoop), Region Server (HBase)
For PC2, should I run ZooKeeper, JobTracker and master too? Can I have 2 masters? Or do I just run the Secondary Name Node?

2012/6/21, Michael Segel <[email protected]>:
> If you have a really small cluster...
> You can put your HMaster, JobTracker, Name Node, and ZooKeeper all on a single node. (Secondary too)
> Then you have Data Nodes that run DN, TT, and RS.
>
> That would solve any ZK RS problems.
>
> On Jun 21, 2012, at 6:43 AM, Jean-Marc Spaggiari wrote:
>
>> Hi Mike, Hi Rob,
>>
>> Thanks for your replies and advice. Seems that now I'm due for some implementation. I'm reading Lars' book first and when I'm done I will start with the coding.
>>
>> I already have my ZooKeeper/Hadoop/HBase running and based on the first pages I read, I already know it's not well done since I have put a DataNode and a ZooKeeper server on ALL the servers ;) So. More reading for me for the next few days, and then I will start.
>>
>> Thanks again!
>>
>> JM
>>
>> 2012/6/16, Rob Verkuylen <[email protected]>:
>>> Just to add from my experiences:
>>>
>>> Yes hotspotting is bad, but so are devops headaches. A reasonable machine can handle 3-4000 puts a second with ease, and a simple timerange scan can give you the records you need. I have my doubts you will be hitting these amounts anytime soon. A simple setup will get your PoC and then scale when you need to scale.
>>>
>>> Rob
>>>
>>> On Sat, Jun 16, 2012 at 6:33 PM, Michael Segel <[email protected]> wrote:
>>>
>>>> Jean-Marc,
>>>>
>>>> You indicated that you didn't want to do full table scans when you want to find out which files hadn't been touched since X time has passed.
>>>> (X could be months, weeks, days, hours, etc ...)
>>>>
>>>> So here's the thing.
>>>> First, I am not convinced that you will have hot spotting.
>>>> Second, you end up having to now do 26 scans instead of one. Then you need to join the result set.
>>>>
>>>> Not really a good solution if you think about it.
>>>>
>>>> Oh and I don't believe that you will be hitting a single region, although you may hit a region hard.
>>>> (Your second table's key is on the timestamp of the last update to the file. If the file hadn't been touched in a week, there's the probability that at scale, it won't be in the same region as a file that had recently been touched.)
>>>>
>>>> I wouldn't recommend HBaseWD. It's cute, it's not novel, and can only be applied on a subset of problems.
>>>> (Think round-robin partitioning in a RDBMS. DB2 was big on this.)
>>>>
>>>> HTH
>>>>
>>>> -Mike
>>>>
>>>> On Jun 16, 2012, at 9:42 AM, Jean-Marc Spaggiari wrote:
>>>>
>>>>> Let's imagine the timestamp is "123456789".
>>>>>
>>>>> If I salt it with a letter from 'a' to 'z' then it will always be split between a few RegionServers. I will have something like "t123456789". The issue is that I will have to do 26 queries to be able to find all the entries. I will need to query from A000000000 to Axxxxxxxxx, then the same for B, and so on.
>>>>>
>>>>> So what's worse? Am I better off dealing with the hotspotting? Salting the key myself? Or what if I use something like HBaseWD?
>>>>>
>>>>> JM
>>>>>
>>>>> 2012/6/16, Michel Segel <[email protected]>:
>>>>>> You can't salt the key in the second table.
>>>>>> By salting the key, you lose the ability to do range scans, which is what you want to do.
>>>>>>
>>>>>> Sent from a remote device. Please excuse any typos...
>>>>>>
>>>>>> Mike Segel
>>>>>>
>>>>>> On Jun 16, 2012, at 6:22 AM, Jean-Marc Spaggiari <[email protected]> wrote:
>>>>>>
>>>>>>> Thanks all for your comments and suggestions. Regarding the hotspotting I will try to salt the key in the 2nd table and see the results.
>>>>>>>
>>>>>>> Yesterday I finished installing my 4-server cluster with old machines.
>>>>>>> It's slow, but it's working. So I will do some testing.
>>>>>>>
>>>>>>> You are recommending to modify the timestamp to be to the second or minute and have more entries per row. Is that because it's better to have more columns than rows? Or is it more because that will allow a more "squared" pattern (lots of rows, lots of columns), which is more efficient?
>>>>>>>
>>>>>>> JM
>>>>>>>
>>>>>>> 2012/6/15, Michael Segel <[email protected]>:
>>>>>>>> Thought about this a little bit more...
>>>>>>>>
>>>>>>>> You will want two tables for a solution.
>>>>>>>>
>>>>>>>> Table 1: Key: Unique ID
>>>>>>>>          Column: FilePath           Value: Full path to the file
>>>>>>>>          Column: Last Update time   Value: timestamp
>>>>>>>>
>>>>>>>> Table 2: Key: Last Update time (the timestamp)
>>>>>>>>          Column 1-N: Unique ID      Value: Full path to the file
>>>>>>>>
>>>>>>>> Now if you want to get fancy, in Table 1, you could use the timestamp on the column File Path to hold the last update time.
>>>>>>>> But it's probably easier for you to start by keeping the data as a separate column and ignore the timestamps on the columns for now.
>>>>>>>>
>>>>>>>> Note the following:
>>>>>>>>
>>>>>>>> 1) I used the notation Column 1-N to reflect that for a given timestamp you may or may not have multiple files that were updated. (You weren't specific as to the scale.)
>>>>>>>> This is a good example of HBase's column-oriented approach where you may or may not have a column. It doesn't matter. :-) You could also modify the timestamp to be to the second or minute and have more entries per row. It doesn't matter. You insert based on timestamp:columnName, value, so you will add a column to this table.
>>>>>>>>
>>>>>>>> 2) First prove that the logic works.
>>>>>>>> You insert/update table 1 to capture the ID of the file and its last update time. You then delete the old timestamp entry in table 2, then insert the new entry in table 2.
>>>>>>>>
>>>>>>>> 3) You store Table 2 in ascending order. Then when you want to find your last 500 entries, you start a scan at 0x000 and then limit the scan to 500 rows. Note that you may or may not have multiple entries, so as you walk through the result set, you count the number of columns and stop when you have 500 columns, regardless of the number of rows you've processed.
>>>>>>>>
>>>>>>>> This should solve your problem and be pretty efficient.
>>>>>>>> You can then work out the Coprocessors and add them to the solution to be even more efficient.
>>>>>>>>
>>>>>>>> With respect to 'hot-spotting', it can't be helped. You could hash your unique ID in table 1; this will reduce the potential of a hotspot as the table splits.
>>>>>>>> On table 2, because you have temporal data and you want to efficiently scan a small portion of the table based on size, you will always scan the first block. However, as data rolls off and compression occurs, you will probably have to do some cleanup. I'm not sure how HBase handles splits that no longer contain data. When you compress an empty split, does it go away?
>>>>>>>>
>>>>>>>> By switching to coprocessors, you now limit the update accessors to the second table so you should still have pretty good performance.
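A minimal in-memory sketch of the two-table flow described in points 2) and 3) above, in Python. This is not HBase client code: plain dicts stand in for both tables, and the names FileIndex, upsert and oldest are made up for illustration. It only shows the bookkeeping (delete the old index cell, insert the new one, then walk the index in ascending key order counting columns rather than rows).

```python
class FileIndex:
    """In-memory stand-in for the two tables (hypothetical, not an HBase API)."""

    def __init__(self):
        self.main = {}      # table 1: file_id -> (path, last_update)
        self.by_time = {}   # table 2: last_update -> {file_id: path}

    def upsert(self, file_id, path, last_update):
        """Insert/update table 1, then move the file's cell in table 2."""
        old = self.main.get(file_id)
        if old is not None:
            old_row = self.by_time.get(old[1], {})
            old_row.pop(file_id, None)          # delete the old timestamp entry
            if not old_row:
                self.by_time.pop(old[1], None)  # drop rows that became empty
        self.main[file_id] = (path, last_update)
        self.by_time.setdefault(last_update, {})[file_id] = path

    def oldest(self, limit):
        """Walk table 2 in ascending key order and stop after `limit` columns."""
        found = []
        for ts in sorted(self.by_time):
            for file_id, path in self.by_time[ts].items():
                found.append((ts, file_id, path))
                if len(found) == limit:
                    return found
        return found


index = FileIndex()
index.upsert("a", "/srv/a", 100)
index.upsert("b", "/srv/b", 100)
index.upsert("c", "/srv/c", 200)
index.upsert("a", "/srv/a", 300)   # file "a" was refreshed, so it moves to row 300
print(index.oldest(2))
```

Note that the limit applies to cells, not rows: a single timestamp row can hold several files, which is why the loop counts columns as it goes.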
>>>>>>>>
>>>>>>>> You may also want to look at Asynchronous HBase, however I don't know how well it will work with Coprocessors or if you want to perform async operations in this specific use case.
>>>>>>>>
>>>>>>>> Good luck, HTH...
>>>>>>>>
>>>>>>>> -Mike
>>>>>>>>
>>>>>>>> On Jun 14, 2012, at 1:47 PM, Jean-Marc Spaggiari wrote:
>>>>>>>>
>>>>>>>>> Hi Michael,
>>>>>>>>>
>>>>>>>>> For now this is more a proof of concept than a production application. And if it's working, it should be growing a lot and the database at the end will easily be over 1B rows. Each individual server will have to send its own information to one centralized server which will insert that into a database. That's why it needs to be very quick and that's why I'm looking in HBase's direction. I tried with some relational databases with 4M rows in the table but the insert time is too slow when I have to introduce entries in bulk. Also, the ability for HBase to keep only the cells with values will allow saving a lot of disk space (future projects).
>>>>>>>>>
>>>>>>>>> I'm not yet used to HBase and there are still many things I need to understand, but until I'm able to create a solution and test it, I will continue to read, learn and try that way. Then at the end I will be able to compare the 2 options I have (HBase or relational) and decide based on the results.
>>>>>>>>>
>>>>>>>>> So yes, your reply helped because it's giving me a way to achieve this goal (using co-processors). I don't know yet how this part works, so I will dig into the documentation for it.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> JM
>>>>>>>>>
>>>>>>>>> 2012/6/14, Michael Segel <[email protected]>:
>>>>>>>>>> Jean-Marc,
>>>>>>>>>>
>>>>>>>>>> You do realize that this really isn't a good use case for HBase, assuming that what you are describing is a stand alone system.
>>>>>>>>>> It would be easier and better if you just used a simple relational database.
>>>>>>>>>>
>>>>>>>>>> Then you would have your table with an ID, and a secondary index on the timestamp.
>>>>>>>>>> Retrieve the data in ascending order by timestamp and take the top 500 off the list.
>>>>>>>>>>
>>>>>>>>>> If you insist on using HBase, yes you will have to have a secondary table.
>>>>>>>>>> Then using co-processors...
>>>>>>>>>> When you update the row in your base table, you then get() the row in your index by timestamp, removing the column for that rowid.
>>>>>>>>>> Add the new column to the timestamp row.
>>>>>>>>>>
>>>>>>>>>> As you put it.
>>>>>>>>>>
>>>>>>>>>> Now you can just do a partial scan on your index. Because your index table is so small... you shouldn't worry about hotspots.
>>>>>>>>>> You may just want to rebuild your index every so often...
>>>>>>>>>>
>>>>>>>>>> HTH
>>>>>>>>>>
>>>>>>>>>> -Mike
>>>>>>>>>>
>>>>>>>>>> On Jun 14, 2012, at 7:22 AM, Jean-Marc Spaggiari wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Michael,
>>>>>>>>>>>
>>>>>>>>>>> Thanks for your feedback. Here are more details to describe what I'm trying to achieve.
>>>>>>>>>>>
>>>>>>>>>>> My goal is to store information about files into the database. I need to check the oldest files in the database to refresh the information.
>>>>>>>>>>>
>>>>>>>>>>> The key is an 8-byte ID of the server name in the network hosting the file + the MD5 of the file path. The total is a 24-byte key.
>>>>>>>>>>>
>>>>>>>>>>> So each time I look at a file and gather the information, I update its row in the database based on the key, including a "last_update" field. I can calculate this key for any file in the drives.
>>>>>>>>>>>
>>>>>>>>>>> In order to know which file I need to check in the network, I need to scan the table by the "last_update" field. So the idea is to build another table which contains the last_update as a key and the file IDs in columns. (Here is the hotspotting.)
>>>>>>>>>>>
>>>>>>>>>>> Each time I work on a file, I will have to update the main table by ID and remove the cell from the second table (the index) and put it back with the new "last_update" key.
>>>>>>>>>>>
>>>>>>>>>>> I'm mainly doing 3 operations in the database.
>>>>>>>>>>> 1) I retrieve a list of 500 files which need to be updated
>>>>>>>>>>> 2) I update the information for those 500 files (bulk update)
>>>>>>>>>>> 3) I load new file references to be checked.
>>>>>>>>>>>
>>>>>>>>>>> For 2 and 3, I use the main table with the file ID as the key. The distribution is almost perfect because I'm using a hash. The prefix is the server ID but it's not always going to the same server since it's done by last_update. But this allows quick access to the list of files from one server.
>>>>>>>>>>> For 1, I expect to build this second table with the "last_update" as the key.
>>>>>>>>>>>
>>>>>>>>>>> Regarding the frequency, it really depends on the activities on the network, but it should be "often".
>>>>>>>>>>> The faster the database update is, the more up to date I will be able to keep it.
>>>>>>>>>>>
>>>>>>>>>>> JM
>>>>>>>>>>>
>>>>>>>>>>> 2012/6/14, Michael Segel <[email protected]>:
>>>>>>>>>>>> Actually I think you should revisit your key design....
>>>>>>>>>>>>
>>>>>>>>>>>> Look at your access path to the data for each of the types of queries you are going to run.
>>>>>>>>>>>> From your post:
>>>>>>>>>>>> "I have a table with a unique key, a file path and a "last update" field.
>>>>>>>>>>>> I can easily find back the file with the ID and find when it has been updated.
>>>>>>>>>>>>
>>>>>>>>>>>> But what I need too is to find the files not updated for more than a certain period of time."
>>>>>>>>>>>>
>>>>>>>>>>>> So your primary query is going to be against the key.
>>>>>>>>>>>> Not sure if you meant to say that your key was a composite key or not... sounds like your key is just the unique key and the rest are columns in the table.
>>>>>>>>>>>>
>>>>>>>>>>>> The secondary query or path to the data is to find data where the files were not updated for more than a period of time.
>>>>>>>>>>>>
>>>>>>>>>>>> If you make your key temporal, that is adding time as a component of your key, you will end up creating new rows of data while the old row still exists.
>>>>>>>>>>>> Not a good side effect.
>>>>>>>>>>>>
>>>>>>>>>>>> The other nasty side effect of using time as your key is that you not only have the potential for hot spotting, but that you also have the nasty side effect of creating splits that will never grow.
>>>>>>>>>>>>
>>>>>>>>>>>> How often are you going to ask to see the files where they were not updated in the last couple of days/minutes? If it's infrequent, then you really shouldn't care if you have to do a complete table scan.
>>>>>>>>>>>>
>>>>>>>>>>>> On Jun 14, 2012, at 5:39 AM, Jean-Marc Spaggiari wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Wow! This is exactly what I was looking for. So I will read all of that now.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Need to read here at the bottom: https://github.com/sematext/HBaseWD
>>>>>>>>>>>>> and here: http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> JM
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2012/6/14, Otis Gospodnetic <[email protected]>:
>>>>>>>>>>>>>> JM, have a look at https://github.com/sematext/HBaseWD (this comes up often.... Doug, maybe you could add it to the Ref Guide?)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Otis
>>>>>>>>>>>>>> ----
>>>>>>>>>>>>>> Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ________________________________
>>>>>>>>>>>>>>> From: Jean-Marc Spaggiari <[email protected]>
>>>>>>>>>>>>>>> To: [email protected]
>>>>>>>>>>>>>>> Sent: Wednesday, June 13, 2012 12:16 PM
>>>>>>>>>>>>>>> Subject: Timestamp as a key good practice?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I watched Lars George's video about HBase and read the documentation and it's saying that it's not a good idea to have the timestamp as a key because that will always load the same region until the timestamp reaches a certain value and moves to the next region (hotspotting).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I have a table with a unique key, a file path and a "last update" field. I can easily find back the file with the ID and find when it has been updated.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> But what I need too is to find the files not updated for more than a certain period of time.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If I want to retrieve that from this single table, I will have to do a full parsing of the table. Which might take a while.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So I thought of building a table to reference that (kind of a secondary index). The key is the "last update", with one column family, and each column will have the ID of the file with a dummy content.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> When a file is updated, I remove its cell from this table, and introduce a new cell with the new timestamp as the key.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> And so on.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> With this schema, I can find the files by ID very quickly and I can find the files which need to be updated pretty quickly too. But it's hotspotting one region.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> From the video (0:45:10) I can see 4 situations.
>>>>>>>>>>>>>>> 1) Hotspotting.
>>>>>>>>>>>>>>> 2) Salting.
>>>>>>>>>>>>>>> 3) Key field swap/promotion
>>>>>>>>>>>>>>> 4) Randomization.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I need to avoid hotspotting, so I looked at the 3 other options.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I can do salting. Like prefixing the timestamp with a number between 0 and 9. So that will distribute the load over 10 servers. To find all the files with a timestamp below a specific value, I will need to run 10 requests instead of one. But when the load becomes too big for 10 servers, I will have to prefix by a byte between 0 and 99? Which means 100 requests? And the more regions I have, the more requests I will have to do. Is that really a good approach?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Key field swap is close to salting. I can add the first few bytes from the path before the timestamp, but the issue will remain the same.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I looked at randomization, and I can't do that. Else I will have no way to retrieve the information I'm looking for.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So the question is: is there a good way to store the data so I can retrieve it based on the date?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> JM
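The salting trade-off raised in the original question (a prefix between 0 and 9, so 10 requests instead of one) can be illustrated with a small Python sketch. It assumes a sorted list of strings in place of an HBase table, a CRC32 of the file ID to pick the salt, and timestamps that fit in 10 digits. The names salted_key, scan and older_than are hypothetical helpers, not part of HBase or HBaseWD.

```python
import bisect
import heapq
import zlib

BUCKETS = 10  # salt prefixes 0..9, as in the question above


def salted_key(ts, file_id):
    """Build '<salt><10-digit timestamp>:<file_id>' (assumed key layout)."""
    salt = zlib.crc32(file_id.encode()) % BUCKETS
    return f"{salt}{ts:010d}:{file_id}"


def scan(sorted_keys, start, stop):
    """Range scan over a sorted key list: start <= key < stop."""
    lo = bisect.bisect_left(sorted_keys, start)
    hi = bisect.bisect_left(sorted_keys, stop)
    return sorted_keys[lo:hi]


def older_than(sorted_keys, cutoff):
    """Entries with timestamp < cutoff: one scan per bucket, then a merge."""
    per_bucket = []
    for salt in range(BUCKETS):
        rows = scan(sorted_keys, f"{salt}", f"{salt}{cutoff:010d}")
        # Strip the salt so rows from different buckets sort by timestamp.
        per_bucket.append([row[1:] for row in rows])
    return list(heapq.merge(*per_bucket))


files = [("f1", 100), ("f2", 150), ("f3", 200), ("f4", 50), ("f5", 999)]
table = sorted(salted_key(ts, fid) for fid, ts in files)
print(older_than(table, 200))
```

The writes are spread over 10 key ranges, but every "older than X" query costs BUCKETS scans plus a merge, and that cost grows with the number of salt values, which is the concern expressed above.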
