Re: regex replace with match
striker wrote:

> I have a large fixed width database file that I would like to delimit
> with commas. For example here are 2 lines of the file:
>
> 210044012123540759F181012004103C14 29847.3741091 4280 5070 42789 28529 2769 2449 3320 2948 05121
> 210044012112140906F091012004101J 2 11048.495609 5559 9973 5180 9974 5680 94572 5451 06148
>
> length of field1 = 8
> length of field2 = 2
> length of field3 = 3
> and so on through 35 fields.
>
> I was going to use a simple regex replacement like:
>
> :%s/\(.\{8}\)\(.\{2}\)\(.\{3}\)/\1,\2,\3,/g
>
> This does work when only replacing a small number of fields. I get 2
> errors when I source a file with the command for all 35 fields. The
> errors are:
>
> E51: Too many \(
> E476: Invalid command
>
> Two questions because of all of this:
> 1) What is the limit of \( that can be used?
> 2) Is there a better way of delimiting the file?

Using the vis.vim plugin (available from
http://mysite.verizon.net/astronaut/vim/index.html#VimFuncs as "Visual
Block Commands"):

Position cursor, line 1, on a delimiting column.

    ctrl-v$:B s/ /,/

Repeat with other columns.

Regards,
Chip Campbell
RE: regex replace with match
>I have a large fixed width database file that I would like to delimit
>with commas. For example here are 2 lines of the file:
...
>***note*** Since this is a fixed width data base file, I can import
>it into a database as is. I am only doing this as a learning
>experience.

Just as a fwiw, when I had to do something like this and needed to
check my progress through the file to make sure nothing was missed (eg,
lines that would fail the pattern-match and be left untouched, or fields
that didn't conform to certain expectations, such as variable
date-formats), I'd take a more piecemeal approach. Eg, the first pass
might be

    :g/\([0-9]*\)\( *\)\(.*\)/s//\1, ###\3/

to "format" the first field and stick a "###" flag/bookmark in there.
Second pass would be something like

    :g/\(.*\)\(###\)\([0-9]*\)\( *\)\(.*\)/s//\1\3, ###\5/

and so on. This way, if I were to encounter some lines that failed the
pattern match and weren't processed, I could simply undo the change and
modify the one-step pattern to suit.

Don't know if this helps at all, but if an all-or-nothing approach is
chancy, or just too long to type in one shot, I'd just throw it at a
quickie 'lex' script to process, or just 'sed' my way through it
(preferably as a script). With those, you can just edit the regexp
that's being used and rerun it with minimal retyping should the pattern
fail.

Quick example of how it *can* fail is back in your sample lines: the
lone "14" and "2" bracketed by whitespace. Do you want to grab a
variable amount of whitespace and end up with "14" and "2" exactly, or
grab a fixed amount of whitespace and end up with "14" and " 2" as
2-digit fields? And if the latter, what if you don't realise there's
one line in the file where that field is "102", and the pattern fails
to match that line because it's expecting to eat 4 spaces, and there
are only 3 between the preceding number and the "102"?

Right tool for the job, and all... ;)
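For what it's worth, the "check that nothing was missed" idea can also be sketched outside Vim. The Python below is only an illustration, not anything posted in this thread: the function names split_fixed and delimit_lines are made up, and only the first three widths (8, 2, 3) come from the original question. It cuts each line by width, reports the lines whose length doesn't fit instead of leaving them silently untouched, and makes the " 2" versus "2" decision an explicit switch.

```python
from typing import Iterable, List, Tuple


def split_fixed(line: str, widths: List[int], strip: bool = False) -> List[str]:
    """Cut `line` into fields of the given widths.

    Raises ValueError when the line length differs from sum(widths),
    so a malformed line is reported rather than mis-split.
    """
    expected = sum(widths)
    if len(line) != expected:
        raise ValueError(f"expected {expected} characters, got {len(line)}")
    fields = []
    pos = 0
    for width in widths:
        field = line[pos:pos + width]
        # strip=True turns " 2" into "2"; strip=False keeps the padding.
        fields.append(field.strip() if strip else field)
        pos += width
    return fields


def delimit_lines(lines: Iterable[str], widths: List[int],
                  strip: bool = False) -> Tuple[List[str], List[int]]:
    """Return (comma-delimited lines, 1-based numbers of skipped lines)."""
    delimited, skipped = [], []
    for number, line in enumerate(lines, start=1):
        try:
            delimited.append(",".join(split_fixed(line, widths, strip)))
        except ValueError:
            skipped.append(number)
    return delimited, skipped


if __name__ == "__main__":
    # Hypothetical 13-character records using only the first three widths.
    sample = ["2100440121235", "21004401 2123", "short"]
    print(delimit_lines(sample, [8, 2, 3]))
    print(delimit_lines(sample, [8, 2, 3], strip=True))
```

The list of skipped line numbers plays the role of the "###" bookmark: it tells you which lines to look at before adjusting the widths and rerunning.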
Re: regex replace with match
striker wrote:

> I have a large fixed width database file that I would like to delimit
> with commas. For example here are 2 lines of the file:
>
> 210044012123540759F181012004103C14 29847.3741091 4280 5070 42789 28529 2769 2449 3320 2948 05121
> 210044012112140906F091012004101J 2 11048.495609 5559 9973 5180 9974 5680 94572 5451 06148
>
> length of field1 = 8
> length of field2 = 2
> length of field3 = 3
> and so on through 35 fields.
>
> I was going to use a simple regex replacement like:
>
> :%s/\(.\{8}\)\(.\{2}\)\(.\{3}\)/\1,\2,\3,/g
>
> This does work when only replacing a small number of fields. I get 2
> errors when I source a file with the command for all 35 fields. The
> errors are:
>
> E51: Too many \(
> E476: Invalid command
>
> Two questions because of all of this:
> 1) What is the limit of \( that can be used?
> 2) Is there a better way of delimiting the file?
>
> ***note*** Since this is a fixed width data base file, I can import it
> into a database as is. I am only doing this as a learning experience.
>
> Thanks,
> Kevin

You can only refer to nine \( \) pairs by means of \1..\9, so I guess
the limit is 9; but you can use \%( \) (which is marginally faster) if
you don't need to refer to it in the "replace to" part of the
:substitute. Or you can do the substitute in several waves.

Best regards,
Tony.
Re: regex replace with match
> I was going to use a simple regex replacement like:
>
> :%s/\(.\{8}\)\(.\{2}\)\(.\{3}\)/\1,\2,\3,/g
>
> This does work when only replacing a small number of fields. I get 2
> errors when I source a file with the command for all 35 fields. The
> errors are:
>
> E51: Too many \(
> E476: Invalid command
>
> Two questions because of all of this:
> 1) What is the limit of \( that can be used?
> 2) Is there a better way of delimiting the file?

1) I believe there's a limit of 9 fields (\1 through \9).

2) perhaps something like

    :%s/\(\%8c\|\%10c\|\%13c\)/,/g

where you specify the columns at which you want to make your
replacements would scale to larger numbers of delimitings. It uses
absolute offsets (8, 10, 13) rather than relative offsets as you listed
(8, 2, 3). You should be able to just keep adding in more columns
delimited by the "\|".

You can read more at

    :help /\%c
    :help /bar

-tim
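The step from relative widths (8, 2, 3) to absolute offsets (8, 10, 13) is just a running sum, so for 35 fields it may be easier to compute the list than to add it up by hand. A rough Python sketch, assuming plain character offsets counted from the start of the line; the function names are made up for illustration and this is not a Vim feature:

```python
from itertools import accumulate
from typing import List


def offsets_from_widths(widths: List[int]) -> List[int]:
    """Turn relative field widths into absolute end offsets (running sum)."""
    return list(accumulate(widths))


def insert_commas(line: str, offsets: List[int]) -> str:
    """Insert a comma after each absolute offset, keeping the rest of the line."""
    pieces = []
    start = 0
    for end in offsets:
        pieces.append(line[start:end])
        start = end
    pieces.append(line[start:])  # whatever follows the last listed field
    return ",".join(pieces)


if __name__ == "__main__":
    offsets = offsets_from_widths([8, 2, 3])
    print(offsets)  # [8, 10, 13]
    # First 19 characters of the first sample line from the question.
    print(insert_commas("210044012123540759F", offsets))
    # 21004401,21,235,40759F
```

Because the line is cut by position rather than matched with capture groups, the nine-group limit never comes into play, however many fields there are.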