Re: [Newbies] Re: conceptual design help

Offray Vladimir Luna Cárdenas Fri, 29 Apr 2016 09:00:14 -0700

Hi Joseph,

I'm making some data visualizations and, although I don't have advice on conceptual design, I share part of the practical problem of having to work with CSV values in a Smalltalk environment, sometimes with a lot of records (my recent project works with 270k of them). The visualization I did is documented broadly at [1], but essentially I created a "PublishedMedInfo class >> loadDataFromCSV: aFile usingDelimiter: aCharacter" method that fills out my domain objects, which came from an Excel (and then CSV) file.


[1] http://mutabit.com/offray/blog/en/entry/sdv-infomed

For my recent project [2] I'm using a SQLite bridge between Pharo and the data imported from CSV. That way I'm delegating storage and querying (including duplicates) to a small but potent database back-end, while using objects to model the "higher" concerns of my domain. I know there are some worries about the object-database impedance mismatch, but working with data and its visualization/reporting lets you build bridges, leaving the former to the database and the latter to objects, while using the strengths of each one in its own place.


[2] https://twitter.com/offrayLC/status/725314838696701957
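
A minimal sketch of that split, written in Python rather than Smalltalk only to keep it self-contained. It is not the code from the project above: the table name, the column names (date, payee, amount) and the sample rows are assumptions made for illustration. The database stores the rows and answers the "which rows repeat?" question, so the domain objects would not have to.

```python
import csv
import io
import sqlite3

# Hypothetical sample data; the column names are assumptions for illustration.
SAMPLE_CSV = """date,payee,amount
2016-04-01,Coffee Shop,2.50
2016-04-01,Coffee Shop,2.50
2016-04-02,Book Store,18.00
"""


def load_csv(connection, text, delimiter=","):
    """Create a transactions table (if needed) and fill it from CSV text."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS transactions "
        "(date TEXT, payee TEXT, amount REAL)"
    )
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    rows = [(r["date"], r["payee"], float(r["amount"])) for r in reader]
    connection.executemany("INSERT INTO transactions VALUES (?, ?, ?)", rows)
    return len(rows)


def find_repeated(connection):
    """Return (date, payee, amount, count) for rows that occur more than once."""
    cursor = connection.execute(
        "SELECT date, payee, amount, COUNT(*) FROM transactions "
        "GROUP BY date, payee, amount HAVING COUNT(*) > 1 ORDER BY date"
    )
    return cursor.fetchall()


connection = sqlite3.connect(":memory:")  # in-memory database for the demo
loaded = load_csv(connection, SAMPLE_CSV)
repeated = find_repeated(connection)
```

Whether a repeated row is an error or a legitimate second purchase is still a domain decision; the query only reports the candidates.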

So my practical advice is to explore this kind of combination early in your design. Maybe a quick hands-on mockup could let you know if it works for you. In my case it has, and I'm introducing it earlier in my projects.


Cheers,

Offray

PS: Long time without writing, but I have been reading constantly. Nice to be "back" :-)


On 29/04/16 09:28, Joseph Alotta wrote:

Thanks for all the help.
I like the idea of having the code sense the format of the data and acting accordingly.
For separators, I could count the number of each kind of separator in the file and compare it to the number of lines. Say 3 or more separators per line.
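
The counting idea above could be sketched roughly as follows (in Python, for illustration only). The list of candidate separators and the function name are assumptions; the "3 or more per line" threshold is the one mentioned above.

```python
# Candidate separators to try; this list is an assumption for illustration.
CANDIDATES = [",", ";", "\t", "|"]


def guess_delimiter(text, min_per_line=3):
    """Pick the candidate with the highest average count per non-blank line.

    Returns None when no candidate reaches min_per_line on average.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return None
    best, best_ratio = None, 0.0
    for candidate in CANDIDATES:
        ratio = text.count(candidate) / len(lines)
        if ratio >= min_per_line and ratio > best_ratio:
            best, best_ratio = candidate, ratio
    return best


sample = (
    "date;payee;amount;memo\n"
    "2016-04-01;Coffee Shop;2,50;refill\n"
    "2016-04-02;Book Store;18,00;gift\n"
)
detected = guess_delimiter(sample)
```

A plain count can be fooled by separators inside quoted fields or decimal commas, so the result is best treated as a guess to confirm, not a guarantee.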
Then I can parse by columns and look for the dominant data type. For a column that is 60% matching a date type, I can assume it is a date column and the mismatches are headers.
The amount should be numeric.

The payee should be mostly letters, etc.
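
One possible sketch of the dominant-type check (Python, for illustration only). The 60% threshold comes from the text above; the ISO date format, the treatment of commas as thousands separators, and all function names are assumptions.

```python
from datetime import datetime


def is_date(value, fmt="%Y-%m-%d"):
    """True if the value parses as a date in the assumed format."""
    try:
        datetime.strptime(value.strip(), fmt)
        return True
    except ValueError:
        return False


def is_number(value):
    """True if the value parses as a number (commas assumed to be thousands separators)."""
    try:
        float(value.strip().replace(",", ""))
        return True
    except ValueError:
        return False


def dominant_type(values, threshold=0.6):
    """Return 'date', 'number', or 'text' for a column of strings."""
    values = [v for v in values if v.strip()]
    if not values:
        return "text"
    for name, test in (("date", is_date), ("number", is_number)):
        matching = sum(1 for v in values if test(v))
        if matching / len(values) >= threshold:
            return name
    return "text"


def mismatches(values, test):
    """Indexes of non-blank cells that fail the column's type test (likely headers)."""
    return [i for i, v in enumerate(values) if v.strip() and not test(v)]


date_column = ["Date", "2016-04-01", "2016-04-02", "2016-04-03"]
column_type = dominant_type(date_column)
header_rows = mismatches(date_column, is_date)
```

Here the first cell fails the date test while the rest pass, so row 0 is flagged as a probable header.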
One issue I have is knowing what to call the object that does this. It would not be a Transaction, because this is a function of many Transactions.
FileLoader?  FileAnalyzer?

Also, at this point I should be looking for missing dates and duplicates.
Duplicates are troublesome, since every time I download the file, it starts from the beginning of the year again. I keep downloading them because I think they will only keep data for 6 months or so.
Also duplicate transactions are valid. Suppose I go into a coffee shop and buy a cup of coffee, then go back the same day, same store for a refill.
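
One way to reconcile overlapping downloads with legitimately repeated transactions is to treat each download as a multiset and keep, for every distinct row, the largest count seen in any single download. The sketch below (Python, for illustration only) assumes each download is complete for the period it covers and that a row is a (date, payee, amount) tuple; both are assumptions, as are the names.

```python
from collections import Counter


def merge_downloads(*downloads):
    """Merge downloads, keeping the maximum per-row count seen in any one file.

    Re-downloading the same period does not inflate the counts, while two
    identical purchases within one download are both kept.
    """
    merged = Counter()
    for rows in downloads:
        merged |= Counter(rows)  # multiset union: element-wise maximum
    return merged


# Hypothetical rows: two coffees on the same day are both real purchases.
first_download = [
    ("2016-01-05", "Coffee Shop", "2.50"),
    ("2016-01-05", "Coffee Shop", "2.50"),
    ("2016-01-09", "Book Store", "18.00"),
]
# The later file starts from the beginning of the year again.
second_download = first_download + [("2016-02-01", "Grocer", "40.00")]

merged = merge_downloads(first_download, second_download)
total_transactions = sum(merged.values())
```

The two coffee purchases survive as a count of 2, and the overlap between the files adds nothing, giving four transactions in total.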
Your thoughts?

Sincerely,

Joe.



------------------------------------------------------------------------
View this message in context: Re: conceptual design help <http://forum.world.st/conceptual-design-help-tp4892763p4892966.html>
Sent from the Squeak - Beginners mailing list archive <http://forum.world.st/Squeak-Beginners-f107673.html> at Nabble.com.
_______________________________________________
Beginners mailing list
Beginners@lists.squeakfoundation.org
http://lists.squeakfoundation.org/mailman/listinfo/beginners
