Hi Adrian,
Thanks for your reply. I'm more looking for software that will help with
the flow, not necessarilly information on how to properly design software
from scratch. But looks like we may do this from scratch in the end.
Thanks!
-Matt
On Tue, 29 Dec 2009, Adrian Noland wrote:
If I understand your problem, then this may help:
http://www.devshed.com/c/a/PHP/Introducing-the-Strategy-Pattern/
http://www.fluffycat.com/PHP-Design-Patterns/Strategy/
I would use it something like this:
$this->data = new DataStream($source);
$bool = new BasicValidate($this->data);
$bool = new DataMapper($this->data);
$bool = new LayoutBuilder($this->data);
if($bool) {
$result = MysqlQuery($data);
}
Adrian
On Tue, Dec 29, 2009 at 11:46 AM, Matt Juszczak <m...@atopia.net> wrote:
Hi Greg,
That's pretty much how we had things mapped out. Basic source validation
in the beginning, then a more
standard validation later on. I guess my question was - is there any
open source project that would help
with this work flow, such as handling exceptions in the case of bad data,
etc. etc.?
-Matt
On Tue, 29 Dec 2009, Greg Rundlett (freephile) wrote:
There are several issues all bundled up here.
First you have to define the file format(s) that your feeds come in so
that you can read them and isolate the data elements correctly. Are
they CSV, TSV, XML, custom?
At the first (read) step you can do gross validation like validate the
source for correct character encoding, record separators, record
length etc.
You also want to define "layouts" for various feeds. By defining the
layout, you can apply a set of processing instructions to transform
the received layout into the desired layout.
Once you have the data in the desired layout, you can then do finer
validation such as per field rules. E.g. last name must
/(a-zA-Z'-){2,100}/
Then you can write an import routine that determines the correct
layout transformation to apply, reads the source, tranposes it,
validates it, stores it in the database, logs the event
hth,
Greg Rundlett
nbpt 978-225-8302
m. 978-764-4424
-skype/aim/irc/twitter freephile
http://profiles.aim.com/freephile
On Tue, Dec 29, 2009 at 12:28 PM, Matt Juszczak <m...@atopia.net> wrote:
Hi all,
I need to write a tool that somehow takes multiple data feeds (some
xls
files, some csv files, etc.), defines each format,
verifies/validates/sanitizes the data, and imports it into
standardized
MySQL tables.
As an example, one feed might look like this:
First Name, Last Name, DOB
And another like this:
Name, DOB
And another:
DOB, Last Name, First Name, Middle Initial
I need to somehow be able to say "For this source, this column
needs to be
split into two, and this other column is the date of birth column".
Once
the data is parsed, it would need to be imported into standard
MySQL tables:
first_name, last_name, birth_date, etc.
This is a simple example - the files and tables actually have 20-30
columns
each, so using mysqlimport or a simple shell script probably isn't
the best
option.
Are there open source tools out there that do this already? If
not, should
I just build something?
Thanks,
-Matt
_______________________________________________
New York PHP Users Group Community Talk Mailing List
http://lists.nyphp.org/mailman/listinfo/talk
http://www.nyphp.org/Show-Participation
_______________________________________________
New York PHP Users Group Community Talk Mailing List
http://lists.nyphp.org/mailman/listinfo/talk
http://www.nyphp.org/Show-Participation
_______________________________________________
New York PHP Users Group Community Talk Mailing List
http://lists.nyphp.org/mailman/listinfo/talk
http://www.nyphp.org/Show-Participation
_______________________________________________
New York PHP Users Group Community Talk Mailing List
http://lists.nyphp.org/mailman/listinfo/talk
http://www.nyphp.org/Show-Participation