Hi, I don't often post here, but I will add my ~30 years of experience with binary file formats. THEY ARE EVIL EVIL THINGS !!
Binary file formats made a lot of sense back in the days of
They do make sense when used internally to a compression
algorithm or an encryption scheme. For a format like .png binary is not so bad, the format is unlikely to change much if ever (even so there are at least 6 different png formats out there, some of them ascii and some binary), but it is a simple format to describe and implement. Any program that works with png's needs to implement support for all versions of png, or clearly explain to the user what is and is not supported with appropriate error messages. Any png programs that don't support all versions tend to fall into disuse and die a slow death. Text based formats on the other hand, if well designed, are (to a fair extent)
I'll climb down from my soapbox now, but I hope I have assisted people's understanding of Wayne's refusal to use binary formats. Regards David G On 19/12/15 05:46, Wayne Stambaugh
wrote:
Mark, I haven't had time to look at your commit so I'm not going to comment on that until I do. I don't know when I'll have time as I have a lot of stuff to do and I'm traveling over the holidays so my review time will be limited. In the mean time I will comment on some of the things I've read in this thread.Binary file formats are not going happen while I'm project leader. I have 30 years of less than positive experience with them and I'm not about to start using them now. Be careful with the C++ << operators. They may not do the correct thing with floating point numbers. There may also be some overhead that the current design does not have. I will not accept a performance hit on file parsing or any significant increase in formatted file size. On 12/17/2015 10:32 PM, Mark Roszko wrote:So awhile back, Wayne said to use sexpr for something I wanted to do. Then I looked at the sexpr parsing and said NOPE. Why NOPE? Because the current parsing regime is basically Visual Basic 6 parser written in a modern language with micro-optimizations meant for someone running a Windows 3.1 computer with 5.25" storage drives. This patch is a propsed sexpr parser that parses sepxr like its sexpr and not a parenthesis format. Because if you are parsing with things like NeedLeft() and NeedRight(), you are parsing it wrong. Especially when you see something like this: case T_descr: if( sawDesc ) in->Duplicate( tok ); sawDesc = true; in->NeedSYMBOLorNUMBER(); row.SetDescr( in->FromUTF8() ); break; but then you realize that fp_lib_table has quoted strings for MOST descr entries. And you question why the method is called NeedSYMBOLorNumber when its a string with quotes. Then you go deeper and realize the parser may as well be a csv reader/writer. You either parse like sexpressions or you stop calling it sexpressions. New proposed system: 1. Reads and generates an in memory tree structure of data as it should be, i.e. lists, strings, numbers, etc.I'm ok with the memory tree structure as long as it doesn't add any significant parsing or conversion to internal object overhead.2. Helpers to pull out each item as need beOK3. Backwards compatibleBackwards compatibility is not optional.4. Doesn't do silly keywords micro optimization at compile time. You do a string comparison to convert the value to integer anyway, using if/else is no different each time. Kicad isn't parsing gigabyte sized files nor hundreds of files, this optimization really isn't worth the overhead in maintenance.This change will probably kill your performance as the file format gets new features which it will. I could be wrong but there is not going to be much faster token look up than integers. Before you decide this optimization isn't worth it, you need to get an 8 layer+ high density board file from someone and do some testing.5. Generate saved files from in memory tree structures, this will avoid all possible formatting irregularities and differences because someone handwrote unrolling all the data members.I need to see how your doing this because keeping both the memory tree and the board objects in memory doesn't make a lot of sense to me.6. Avoid things like " ${KIGITHUB}/Air_Coils_SML_NEOSID.pretty" being defined as a symbol instead of a string in the future. 7. Explicit definitions of symbols and strings. Strings are always quoted. Period. No silly auto-detection logic. So my first goal is to have 1:1 parity with the existing stuff for kicad files for both reading and writing. Benchmarks: Old fp-lib-table read: 1ms New fp-lib-table read: 2ms Here's the actual commit for fp-lib-table: https://github.com/marekr/kicad-sexpr/commit/9367f469be69962d14671411eddd6fd759ace1f2 Not expecting anyone to compile it or anything, more input that anything. Yes its messy as its an initial proof of concept. Yes, string parsing and writing isn't escaping properly, TBD (easy, just lazy). There would probably be a smaller kicad_sexpr wrapper to implement common sexpr pattern helpers such as the "list key-value pair" that's used. Also played around with .kicad-pcb but its not committed: Old read: ~350ms New read: ~230ms I await the abuse. _______________________________________________ Mailing list: https://launchpad.net/~kicad-developers Post to : kicad-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~kicad-developers More help : https://help.launchpad.net/ListHelp_______________________________________________ Mailing list: https://launchpad.net/~kicad-developers Post to : kicad-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~kicad-developers More help : https://help.launchpad.net/ListHelp |
_______________________________________________ Mailing list: https://launchpad.net/~kicad-developers Post to : kicad-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~kicad-developers More help : https://help.launchpad.net/ListHelp