Just got the stats for February 2024, and when the ProFox List Statistics
for January turned out to be one of the top posts of the month, I realized
I've been too busy to comment (and I assume everyone else has been busy
too), so consider this a kick-off post for March to help improve, or at
least stimulate, some conversation in here!

 

So, last month I took the plunge and purchased a copy of VFPA 10.2. I have
a software app I wrote a long time ago to help telemarketers comply with the
Telemarketing Sales Rule by scrubbing their call lists against the National
Do Not Call List sold by the Federal Trade Commission. The program (which I
creatively named Call List Scrubber) was originally written in VFP6, and
just like VFP itself, it is a mature, stable product that requires little
maintenance and is easy to use and easy to support.

 

That said, the full National Do Not Call List has grown to almost 3 GB.
While we switched several years ago from FREAD() to the
Scripting.FileSystemObject, up until now we could import and store the
information in a DBF without exceeding the 2 GB limit by using several
tricks, not the least of which was storing the area code as a 3-byte
character field and the 7-digit number as a 4-byte integer. I also built
the original program to use stand-alone IDX files instead of compound CDX
files, which was another trick that helped us over the years as we dealt
with the unceasing growth of the National Do Not Call List.
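For anyone curious, here's a minimal sketch of what that compact layout
might look like; the table and field names are mine, not the actual Call
List Scrubber code:

```foxpro
* Hypothetical layout: the area code as a 3-byte character field and
* the 7-digit local number packed into a 4-byte integer, which keeps
* the DBF under the 2 GB limit far longer than storing 10 digits as text.
CREATE TABLE ndnc FREE (cAreaCode C(3), iNumber I)

* Stand-alone IDX files instead of a structural CDX.
* BINTOC() turns the integer into an index-safe character key.
INDEX ON cAreaCode TO ndnc_ac
INDEX ON cAreaCode + BINTOC(iNumber) TO ndnc_num
```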

 

While most of our users purchase only individual area code DNC lists from
the FTC, in January we had a new client who purchased the full national
list, and you guessed it: Call List Scrubber could no longer import the full
National DNC list and function. So, I helped the client split the file into
two roughly equal portions and set up two installs for them, and they are
able to use the product for now. But that meant I needed to find a
permanent solution. One choice, and the one most of you would probably
recommend, would be to upsize to a SQL database or perhaps SQLite. But that
would require almost a full rewrite of the product, and I don't really have
the time for that at the moment and, more importantly, I don't have the
motivation for a full rewrite right now (perhaps later). So I started
casting about for alternatives, and VFP Advanced came to mind.

 

I reached out to Chen via email and asked a few questions about how to
implement this product, which he graciously answered, and I ultimately
purchased version 10.2 from him to test and play with to see whether the
large-file fix would work in my application. The installation went very
smoothly once I upgraded per his instructions. I easily set it up to run
side by side with VFP9: I continue to do the majority of my work in VFP9,
and I simply use a separate startup shortcut when I want VFPA 10.2 instead.
It works completely as advertised and does not cross over into my standard
installation in any way.

 

Now that I've completed my initial testing I'd like to share some of my
results and invite some comments:

 

First, to answer the obvious question: yes, with VFPA I can import the
full National DNC file with 250 million+ telephone numbers into my program
without any changes beyond recompiling with VFPA 10.2. After importing the
NDNC text file (~3 GB) and stripping off the unused characters (commas,
spaces, etc.), the finished DBF is 2.004 GB, the area code IDX is 1.083 GB,
and the full 10-digit number IDX is 1.750 GB.

 

Second, I ran several tests with VFPA and found that if I split the list
into 10 separate area code files (0-100, 100-200, 200-300, etc.), it took
about twice as long to import and populate those tables as it did to import
the full file into one DBF. I may need to do more testing on this idea, but
it seems like a convoluted solution to me, so I didn't pursue it far.

 

Third, the best performance came from a buffered import where I read the
text data in batches. Each record in the NDNC file is 12 bytes long and
looks like "999,9999999" plus a line feed. Using the
Scripting.FileSystemObject, I read 12^6 characters (248,832 records) at a
time, use ALINES() to convert the buffer to an array, and then insert each
of the array "rows" into the DBF. This method lets me import all 250
million telephone numbers on the Do Not Call List, populate a DBF in excess
of 2 GB, and update two IDX files at the same time in just under 4 hours.
Some of that work ran across my internal network, so I might pick up some
extra time by running solely on an SSD, but if anyone has other ideas on
how to import, populate, and update the IDX files any faster, please let me
know.
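In case it helps the discussion, the buffered loop looks roughly like this;
file names and field names are illustrative, not the production code:

```foxpro
* Sketch of the batched import: read 12^6 characters at a time
* (12-byte records, so exactly 248,832 whole records per batch),
* split with ALINES(), and insert into the DBF.
LOCAL loFSO, loFile, lcBuffer, lnLines, lnI
LOCAL ARRAY laLines[1]

loFSO  = CREATEOBJECT("Scripting.FileSystemObject")
loFile = loFSO.OpenTextFile("ndnc_full.txt", 1)   && 1 = ForReading

USE ndnc EXCLUSIVE
SET INDEX TO ndnc_ac, ndnc_num   && IDX files updated as rows are inserted

DO WHILE NOT loFile.AtEndOfStream
    lcBuffer = loFile.Read(2985984)        && 12^6 characters
    lnLines  = ALINES(laLines, lcBuffer)   && splits on CR/LF
    FOR lnI = 1 TO lnLines
        * "999,9999999" -> 3-char area code + 7-digit number
        INSERT INTO ndnc (cAreaCode, iNumber) ;
            VALUES (LEFT(laLines[lnI], 3), VAL(SUBSTR(laLines[lnI], 5)))
    NEXT
ENDDO

loFile.Close()
USE IN ndnc
```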

 

Fourth, VFPA performs very well in my tests. I have found nothing
incompatible with the original VFP6 code, and I suspect I won't find any
major differences with VFP9 code either, because I'm using both VFP6 and
VFP9 class libraries in my tests and everything works as expected. Chen has
been responsive to my questions and released an update even as I was
writing this. I'm not sure I'm ready to cut over to VFPA for my primary
development work, but it is looking like a great option in this particular
situation.

 

Finally, one issue concerns me at this point, and that is memory usage.
During an import test like the ones described here, memory usage by VFP and
the Large Memory module Chen provides grows from the ~500 MB I usually see
when running one of my VFP applications to 2-2.5 GB. I suspect this is
caused by the two IDX files I keep open during the import; I open them with
the DBF so they are updated as the data is inserted. That may not be the
best approach, since recreating the indexes after the import completes
might be faster, BUT I'm not sure how to give the user feedback on the
progress of the indexing, and maybe I just need someone to guide me on that
point. I still need to test the import without the IDX files open to
confirm my suspicion about the memory usage. The most important issue,
though, is that the memory allocated during the import isn't released when
I close the table and the indexes. In fact, it isn't released at all until
I close VFPA and restart it. Does anyone have insight into this memory
usage weirdness I'm seeing?
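Two things I'm considering trying, sketched below; I haven't verified
either against VFPA specifically, and the 512 MB cap is an arbitrary
example, so treat this as a starting point rather than a fix:

```foxpro
* Idea 1: cap VFP's foreground buffer memory so the cache cannot
* balloon during the import (size here is just an example).
= SYS(3050, 1, 536870912)   && 512 MB

* Idea 2: import with no indexes open, then build the IDX files
* afterwards. With SET TALK ON, VFP shows indexing progress in the
* status bar, which may be enough user feedback.
USE ndnc EXCLUSIVE
SET INDEX TO               && nothing open during the import
* ... run the buffered import here ...
SET TALK ON
INDEX ON cAreaCode TO ndnc_ac
INDEX ON cAreaCode + BINTOC(iNumber) TO ndnc_num
```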

 

OK, I know this is a long email, but I'm trying to start some conversation,
and there are several points I think y'all can help me with, so have at it.

 

Any feedback is helpful in this situation; let me know what you know!

 

Paul H. Tarver 
Tarver Program Consultants, Inc. 



 




_______________________________________________
Post Messages to: ProFox@leafe.com
Subscription Maintenance: https://mail.leafe.com/mailman/listinfo/profox
OT-free version of this list: https://mail.leafe.com/mailman/listinfo/profoxtech
Searchable Archive: https://leafe.com/archives
This message: 
https://leafe.com/archives/byMID/045a01da6c09$6ec1a1a0$4c44e4e0$@tpcqpc.com
** All postings, unless explicitly stated otherwise, are the opinions of the 
author, and do not constitute legal or medical advice. This statement is added 
to the messages for those lawyers who are too stupid to see the obvious.
