Apologies Andrew, I misread and thought you provided the data files for use in improving wee_import WD imports rather than seeking help with the import. Other things have taken up my time, I will try to have a look Friday.
Gary On Thursday, 17 September 2020 at 08:39:49 UTC+10 Andrew M wrote: > Bumping this post to see if anyone has any suggestions on my the attached > is failing import. > > Thank you in advance. > > > > On Wednesday, September 9, 2020 at 11:02:01 PM UTC-4, Andrew M wrote: >> >> >> I do appreciate the assistance. Hopefully this will help others in either >> with what I am doing or identifying a problem with the import process. >> >> I have attached a copy of one Weather Display log file I am having issues >> with (82013lg.txt.zip) as well as one I did not have any issues with >> (52013lg.txt.zip). >> Also attached is output from wee_debug >> >> I am not seeing any differences in the data between these files that is >> causing 82013lg.txt to choke. >> >> ~20 files out of ~400 files had an issue. >> >> Please let me know if any additional information is needed. >> >> >> >> From running wee_import on 82013lg.txt >> pi@weather:/var/tmp $ wee_import --import-config=/var/tmp/wd.conf >> --dry-run --verbose >> Using WeeWX configuration file /etc/weewx/weewx.conf >> Starting wee_import... >> Weather Display monthly log files in the '/var/tmp/WD' directory will be >> imported >> The following options will be used: >> config=/etc/weewx/weewx.conf, import-config=/var/tmp/wd.conf >> from=None, to=None >> dry-run=True, calc_missing=False, ignore_invalid_data=True >> monthly logs are in US units >> tranche=300, interval=60 >> UV=True, radiation=True ignore extreme temperature and humidity=True >> Using database binding 'wx_binding', which is bound to database >> 'roundhillvaw_archive' >> Destination table 'archive' unit system is '0x01' (US). >> This is a dry run, imported data will not be saved to archive. >> Starting dry run import ... >> Records covering multiple periods have been identified for import. >> Obtaining raw import data for period 1 ... >> Traceback (most recent call last): >> File "/usr/share/weewx/wee_import", line 900, in <module> >> main() >> File "/usr/share/weewx/wee_import", line 830, in main >> source_obj.run() >> File "/usr/share/weewx/weeimport/weeimport.py", line 368, in run >> _raw_data = self.getRawData(period) >> File "/usr/share/weewx/weeimport/wdimport.py", line 568, in getRawData >> for rec in _reader: >> File "/usr/lib/python3.7/csv.py", line 112, in __next__ >> row = next(self.reader) >> _csv.Error: line contains NULL byte >> >> >> >> From log file >> pi@weather:~ $ sudo tail -f /var/log/weewx.log >> Sep 9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport: The >> following options will be used: >> Sep 9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport: >> config=/etc/weewx/weewx.conf, import-config=/var/tmp/wd.conf >> Sep 9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport: >> from=None, to=None >> Sep 9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport: >> dry-run=True, calc_missing=False, ignore_invalid_data=True >> Sep 9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport: >> monthly logs are in US units >> Sep 9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport: >> tranche=300, interval=60 >> Sep 9 22:39:01 weather wee_import[16538] DEBUG weeimport.wdimport: >> UV=True, radiation=True ignore extreme temperature and humidity=True >> Sep 9 22:39:01 weather wee_import[16538] INFO weeimport.wdimport: Using >> database binding 'wx_binding', which is bound to database >> 'roundhillvaw_archive' >> Sep 9 22:39:01 weather wee_import[16538] INFO weeimport.wdimport: >> Destination table 'archive' unit system is '0x01' (US). >> Sep 9 22:39:01 weather wee_import[16538] INFO weeimport.weeimport: >> Obtaining raw import data for period 1 ... >> Sep 9 22:42:24 weather wee_import[16712] INFO __main__: Starting >> wee_import... >> Sep 9 22:42:30 weather wee_import[16712] DEBUG weewx.manager: Daily >> summary version is 2.0 >> Sep 9 22:42:30 weather wee_import[16712] INFO weeimport.wdimport: >> Weather Display monthly log files in the '/var/tmp/WD' directory will be >> imported >> Sep 9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport: The >> following options will be used: >> Sep 9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport: >> config=/etc/weewx/weewx.conf, import-config=/var/tmp/wd.conf >> Sep 9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport: >> from=None, to=None >> Sep 9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport: >> dry-run=True, calc_missing=False, ignore_invalid_data=True >> Sep 9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport: >> monthly logs are in US units >> Sep 9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport: >> tranche=300, interval=60 >> Sep 9 22:42:30 weather wee_import[16712] DEBUG weeimport.wdimport: >> UV=True, radiation=True ignore extreme temperature and humidity=True >> Sep 9 22:42:30 weather wee_import[16712] INFO weeimport.wdimport: Using >> database binding 'wx_binding', which is bound to database >> 'roundhillvaw_archive' >> Sep 9 22:42:30 weather wee_import[16712] INFO weeimport.wdimport: >> Destination table 'archive' unit system is '0x01' (US). >> Sep 9 22:42:30 weather wee_import[16712] INFO weeimport.weeimport: >> Obtaining raw import data for period 1 ... >> Sep 9 22:42:36 weather wee_import[16714] INFO __main__: Starting >> wee_import... >> Sep 9 22:42:42 weather wee_import[16714] DEBUG weewx.manager: Daily >> summary version is 2.0 >> Sep 9 22:42:42 weather wee_import[16714] INFO weeimport.wdimport: >> Weather Display monthly log files in the '/var/tmp/WD' directory will be >> imported >> Sep 9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport: The >> following options will be used: >> Sep 9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport: >> config=/etc/weewx/weewx.conf, import-config=/var/tmp/wd.conf >> Sep 9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport: >> from=None, to=None >> Sep 9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport: >> dry-run=True, calc_missing=False, ignore_invalid_data=True >> Sep 9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport: >> monthly logs are in US units >> Sep 9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport: >> tranche=300, interval=60 >> Sep 9 22:42:42 weather wee_import[16714] DEBUG weeimport.wdimport: >> UV=True, radiation=True ignore extreme temperature and humidity=True >> Sep 9 22:42:42 weather wee_import[16714] INFO weeimport.wdimport: Using >> database binding 'wx_binding', which is bound to database >> 'roundhillvaw_archive' >> Sep 9 22:42:42 weather wee_import[16714] INFO weeimport.wdimport: >> Destination table 'archive' unit system is '0x01' (US). >> Sep 9 22:42:42 weather wee_import[16714] INFO weeimport.weeimport: >> Obtaining raw import data for period 1 ... >> >> >> >> >> >> >> On Tuesday, September 1, 2020 at 7:12:29 PM UTC-4, gjr80 wrote: >>> >>> Didn’t take your post as a complaint, sorry if I came across in a manner >>> that gave that impression. Just wanted to point out that development of the >>> WD module was based on a very small sample of data rather than from some >>> written specification. If the problem was a malformed data file that caused >>> wee_import to abort we may be able to harden wee_import skip the malformed >>> lines. If the problem is a new format/structure in the data file that >>> wee_import rejected then we may need to rework the WD module to handle this >>> new format/structure. >>> >>> If you want to send any relevant log entries/errors and a copy of a >>> misbehaving data file by direct email thats fine by me. >>> >>> Gary >>> >>> On Wednesday, 2 September 2020 at 08:15:51 UTC+10 Andrew M wrote: >>> >>>> Please don't take what i have written as a complaint. >>>> I appreciate all the hard work it took in the creation of WeeWx and all >>>> the contributors to the forum. >>>> >>>> I have not looked at the content of those files it choked on to see if >>>> it has good data in it. Let me first do that. I will reply back once I >>>> have >>>> done that. >>>> >>>> Thank you. >>>> >>>> andrew >>>> >>>> On Tuesday, September 1, 2020 at 5:20:30 PM UTC-4 gjr80 wrote: >>>> >>>>> If you to provide some details on problems the handful of files >>>>> experienced I am happy to look at wee_import to see if any changes can be >>>>> made to improve its handling of such files. The WD import module of >>>>> wee_import was developed based on a handful of WD log files found on the >>>>> internet, so it is quite possible there are some corner cases that may >>>>> cause wee_import to reject a file. >>>>> >>>>> Gary >>>>> >>>>> On Wednesday, 2 September 2020 at 07:08:20 UTC+10 Andrew M wrote: >>>>> >>>>>> After many, many hours the wee-import processed all but a handful >>>>>> of files. Have several that choked on the import. >>>>>> Ended up with 3,873,678 records. >>>>>> >>>>>> Next I need to validate that the data imported to look for any >>>>>> potential bad data. >>>>>> >>>>>> >>>>>> >>>>>> On Friday, August 21, 2020 at 7:31:36 PM UTC-4 Andrew M wrote: >>>>>> >>>>>>> Thank you for the response. >>>>>>> >>>>>>> I actually changed up how I am going about this. My WD ran on a >>>>>>> Windows10 box. I had a RaspPi box sitting in a box so i decided to use >>>>>>> that >>>>>>> for WeeWx. >>>>>>> I was formatting the hard drive on the Win10 machine when i had the >>>>>>> thought that I should just put Debian on that and use that for WeeWx. >>>>>>> Which I did. This machine is a little faster that the RaspPi , so >>>>>>> one I have other things straight I will use that for the conversion. >>>>>>> Migrating the MySql from WD to the tables that WeeWX has is probably >>>>>>> doable, but really don't want to sit down and think about it that much. >>>>>>> I >>>>>>> will just let the current weather collect on the RaspPi box and then >>>>>>> once >>>>>>> the import is done on the now Debian box I can combine the two DB much >>>>>>> easier. >>>>>>> >>>>>>> I have 60k rows on the MySQL DB with my current webhost, but more >>>>>>> than that saved external to that on a hard drive from a previous >>>>>>> webhost >>>>>>> that I never moved. >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Wednesday, August 19, 2020 at 8:37:29 PM UTC-4 gjr80 wrote: >>>>>>> >>>>>>>> Sorry can't help you with MySQL to MySQL migration. >>>>>>>> >>>>>>>> Regards wee_import though, yes it can be slow. When I wrote the WD >>>>>>>> import module the WD user who first used it in anger had something >>>>>>>> like >>>>>>>> (from memory) 10 years of data to import. The import had to be done in >>>>>>>> batches (again from memory) of 2-3 years each (wee_import uses >>>>>>>> transactions on the database but does keep track of duplicate >>>>>>>> timestamps >>>>>>>> and a few other things so memory usage does grow as the span of the >>>>>>>> import >>>>>>>> grows). I found one email from the user with the results of the first >>>>>>>> batch >>>>>>>> import and 1.4 millions records were imported on a Raspberry Pi in 58 >>>>>>>> minutes. Al things considered I find that reasonable. >>>>>>>> >>>>>>>> There are a couple of things you can do to speed up wee_import. >>>>>>>> You can tweak a the tranche setting in the import config file, >>>>>>>> this alters the size of the transactions (in records) that wee_import >>>>>>>> uses. >>>>>>>> The default is 250, you could raise this which will result in fewer db >>>>>>>> transactions but it will likely increase memory usage so you may need >>>>>>>> to do >>>>>>>> the import in smaller batches. One other approach if using a slow(ish) >>>>>>>> RPi >>>>>>>> as your WeeWX machine is to do just the import on a faster machine and >>>>>>>> then >>>>>>>> copy the imported data to the WeeWX RPi. Granted this is simpler when >>>>>>>> using >>>>>>>> SQLite but depending on your setup could be adapted for MySQL. >>>>>>>> >>>>>>>> You say you have 60 000 odd MySQL records, that does not seem like >>>>>>>> much, how does that correlate with the number of entries in the WD log >>>>>>>> files? >>>>>>>> >>>>>>>> Gary >>>>>>>> >>>>>>>> On Thursday, 20 August 2020 08:48:46 UTC+10, Andrew M wrote: >>>>>>>>> >>>>>>>>> I started to use the wee_import process to process all my WD log >>>>>>>>> files to WeeWx MySqlDB, and it was taking a long time. It seems like >>>>>>>>> If I >>>>>>>>> have many years of data it will take that long to import them into >>>>>>>>> WeeWx >>>>>>>>> MySqlDB. >>>>>>>>> >>>>>>>>> I then thought, oh wait, i already have the WD data in a MySql DB, >>>>>>>>> so why am i doing this process. >>>>>>>>> >>>>>>>>> Now I have to figure out how I can gracefully import all the WD >>>>>>>>> data I have in a MySQL DB to the one I set up for WeeWx. Both are on >>>>>>>>> the >>>>>>>>> same hosted server. Different DB names. >>>>>>>>> >>>>>>>>> There is one table for WD and multiple tables for WeeWx so have >>>>>>>>> no idea on where to begin with this. I have ~60,000 rows of data in >>>>>>>>> the WD >>>>>>>>> MySQL DB table. >>>>>>>>> >>>>>>>>> Does anyone have a graceful way of migrating the WD MySQL DB into >>>>>>>>> a WeeWX DB? >>>>>>>>> >>>>>>>>> Am I overlooking something in the documentation or in group? >>>>>>>>> >>>>>>>>> Thank you. >>>>>>>>> >>>>>>>>> >>>>>>>>> -- You received this message because you are subscribed to the Google Groups "weewx-user" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/weewx-user/a1fa6599-3ea1-4d29-86ff-87d9cc68437fn%40googlegroups.com.
