On 05/12/2010 04:18 PM, esquifit wrote:
On 11 May, 20:31, Tim Chase <[email protected]> wrote:
- Every space that's in a line must be counted, placed upfront the
line, and by the number + 1 needs to be done.

You can use the following:

    :%s/.*/\=strlen(substitute(submatch(0), '\S\+', '', 'g')).' '.submatch(0)

to prepend the space-counts.
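
To make the expression easier to read, the counting part can be pulled
into a small function. This is only a sketch: the name CountWhitespace
is made up for illustration and does not appear in the thread, and
strlen() counts bytes, which equals the number of characters only when
the remaining whitespace is plain spaces and tabs.

```vim
" Hypothetical helper; the name CountWhitespace is an assumption.
function! CountWhitespace(text) abort
  " Delete every run of non-whitespace characters, leaving only the
  " whitespace, then measure what remains (strlen() counts bytes).
  return strlen(substitute(a:text, '\S\+', '', 'g'))
endfunction

" 'a b  c' contains three spaces, so this echoes 3.
echo CountWhitespace('a b  c')
```

With that function defined, the same one-pass substitute could be
written as

    :%s/.*/\=CountWhitespace(submatch(0)).' '.submatch(0)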

How would such a formula perform when processing a big file? I came
upon another, more or less obvious, solution (again, with some
simplifications, like not distinguishing between spaces and tabs).
I'm using a macro:

yyP:s/\S//g<CR>"=col('$')<CR>pJ

Just a little back-of-the-envelope thinking:

- copy the line into the scratch register, purging the previous contents of the scratch register and updating the "0" (yank) register: O(1)

- switch to command-line mode: O(1)

- perform a substitute across the line: O(len(line))

- switch back to normal mode: O(1)

- switch into the expression-register entry mode:  O(1)

- pull the length of the line: O(1)

- switch back to normal mode: O(1)

- insert the preserved contents of the line: O(1)

- join the two lines: either O(1) or O(len(line))
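
For reference, here is a rough sketch of one macro iteration written as
Ex commands, with each piece commented. It assumes the cursor starts on
the line to process and that the line contains at least one
non-whitespace character (otherwise the :s reports an error). It is a
reading of the quoted macro, not a replacement for it.

```vim
" yyP: yank the current line (this fills the unnamed and "0 registers)
" and put the copy above it; the cursor ends up on the copy.
normal! yyP

" Delete every non-whitespace character on the copy, so that only the
" spaces and tabs remain.
s/\S//g

" "=col('$')<CR> loads the expression register; col('$') is one more
" than the byte length of the line, i.e. the whitespace count + 1.
" p puts that number into the whitespace-only line, and J joins it with
" the original line below.
execute "normal! \"=col('$')\<CR>pJ"
```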

I run this macro on all lines (99999 times or so)

So this comes out to

  ((k + len(line)) * number_of_lines)

where k = "mode-switching" time + "copying to 2 registers" time + "inserting a line" time + "joining a line" time + "removing the leading space with your following substitute" time

Recording the macro (and burning a register to hold it) and ensuring that your "99999" covers enough lines (if you choose fewer than the actual number of lines, you have to re-execute the macro) also take a bit more time.

and at the end I suppress the leading spaces with

:%s/\s//

Doing a second pass over the entire file is then O(number_of_lines), which adds to the above sum.

Without having done any benchmarking, I *suspect* that this can be
quicker than using strlen and submatches.  I'd like to hear your
opinion about this.

My suggestion is one-pass (it touches each line once and only once): it performs the substitute on each line as it's touched, finds the length (which may be O(1) or O(len) depending on implementation), and joins the space-count together with the line content. If speed is important, you might be able to tweak my original to

 :%s/^/\=strlen(substitute(getline('.'), '\S\+', '', 'g')).' '

which lets the initial search regexp terminate immediately (it only has to match the start of the line, not the whole line) and does a little less work to get the results. Both my original solution and my second suggestion have the additional benefits of

- not switching between various modes multiple times
- not tromping the contents of your scratch register
- not tromping the contents of your "0" (yank) register
- not burning a register for the macro
- not having to guess how many executions to replay

Feel free to benchmark if it matters to you :)
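
A minimal way to time it, assuming your Vim was built with the +reltime
feature, is to put something like the following in a file and :source
it with the big file loaded. The variable name g:bench_start is made up
for illustration. The substitute modifies the buffer, so run it on a
scratch copy or undo with u afterwards.

```vim
" Record the start time (needs +reltime).
let g:bench_start = reltime()

" The one-pass substitute being measured; :silent hides the
" "N substitutions on N lines" message.
silent %s/^/\=strlen(substitute(getline('.'), '\S\+', '', 'g')).' '

" Print the elapsed wall-clock time in seconds.
echo 'elapsed: ' . reltimestr(reltime(g:bench_start))
```

The macro version can be timed the same way by replacing the substitute
line with the commands that replay the macro and do the cleanup pass.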

It just goes to show that Vim accommodates a variety of solutions and has plenty of room for tweaking solutions if one doesn't work for you.

-tim

