Re: GSoC 2016 | Proposal | Incremental Rewrite of git bisect

2016-03-25 Thread Pranit Bauva
On Fri, Mar 25, 2016 at 5:10 PM, Christian Couder wrote:
> On Fri, Mar 25, 2016 at 11:15 AM, Pranit Bauva  wrote:
>>> - you will add an option to "git bisect--helper" to perform what the
>>> git-bisect.sh function did, and
>>> - you will create a test script for "git bisect--helper" in which you
>>> will test each option?
>>
>> I had initially planned to do this, but Matthieu pointed out that
>> it would be much better to use the existing test suite than to
>> create a new one, which could lead to less coverage.
>
> Ok, then perhaps:
>
> - you will add tests to existing test scripts, so that each "git
> bisect--helper" option is (indirectly) tested.
Yes. I will mention this in the proposal as well. Thanks for the reminder.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: GSoC 2016 | Proposal | Incremental Rewrite of git bisect

2016-03-25 Thread Christian Couder
On Fri, Mar 25, 2016 at 11:15 AM, Pranit Bauva  wrote:
>> - you will add an option to "git bisect--helper" to perform what the
>> git-bisect.sh function did, and
>> - you will create a test script for "git bisect--helper" in which you
>> will test each option?
>
> I had initially planned to do this, but Matthieu pointed out that
> it would be much better to use the existing test suite than to
> create a new one, which could lead to less coverage.

Ok, then perhaps:

- you will add tests to existing test scripts, so that each "git
bisect--helper" option is (indirectly) tested.


Re: GSoC 2016 | Proposal | Incremental Rewrite of git bisect

2016-03-25 Thread Pranit Bauva
> - you will add an option to "git bisect--helper" to perform what the
> git-bisect.sh function did, and
> - you will create a test script for "git bisect--helper" in which you
> will test each option?

I had initially planned to do this, but Matthieu pointed out that
it would be much better to use the existing test suite than to
create a new one, which could lead to less coverage.

Thanks,
Pranit Bauva


Re: GSoC 2016 | Proposal | Incremental Rewrite of git bisect

2016-03-25 Thread Pranit Bauva
On Fri, Mar 25, 2016 at 2:45 PM, Matthieu Moy wrote:
> Christian Couder  writes:
>
>> On Thu, Mar 24, 2016 at 12:27 AM, Pranit Bauva wrote:
>>
>>> Unification of bisect.c and bisect--helper.c
>>>
>>> This will unify the algorithmic and non-algorithmic parts of bisect
>>> bringing them under one heading to make the code clean.
>>
>> I am not sure this is needed and a good idea. Maybe you will rename
>> "builtin/bisect--helper.c" to "builtin/bisect.c" and remove
>> git-bisect.sh at the same time to complete the shell to C move. But
>> the actual bisect.{c,h} might be useful as they are for other
>> purposes.
>
> Yes. My view on this is that builtin/*.c should be just user-interface,
> and actual stuff should be outside builtin, ideally in a well-designed
> and reusable library (typically re-usable by libgit2 or others to
> provide another UI for the same feature). Not all commands work this
> way, but I think this is a good direction to take.

Okay. I didn't know about this. Thanks for completing Christian's point.

>> When you have sent one patch series, even a small one, then your main
>> goal should be to have this patch series merged.
>
> I'd add: to get a patch series merged, two things take time:
>
> 1) latency: leave time for other people to read and comment on your code.
>
> 2) extra-work required by reviewers.
>
> You want to send series early because of 1) (then you can work on the
> next series while waiting for reviews on the current one), and you need
> to prioritize 2) over working on the next series to minimize in-flight
> topics.

I had planned to work this way. I will include this in the proposal.
It does create some confusion for me and I tend to mix things up,
but I will keep a hard copy to jot down the discussions and my
thoughts.


Re: GSoC 2016 | Proposal | Incremental Rewrite of git bisect

2016-03-25 Thread Pranit Bauva
On Fri, Mar 25, 2016 at 2:32 PM, Christian Couder wrote:
> On Thu, Mar 24, 2016 at 12:27 AM, Pranit Bauva  wrote:
>> Hey!
>>
>> I have prepared a proposal for Google Summer of Code 2016. I know this
>> is a bit late, but please try to give your comments and suggestions.
>> My proposal could greatly benefit from them. Some questions:
>>
>> 1. Should I include more ways in which it can help Windows?
>
> I don't think it is necessary.
>
>> 2. Should I include the function names I intend to convert?
>
> I don't think it is necessary, but if you want, you can take a look at
> some big ones (or perhaps just one big one) and explain how you plan
> to convert it (using which C functions or APIs).

I will try to do this for one big function if there is some time left.

>> 3. Is my timeline (a bit different) going to affect me in any way?
>
> What is important with the timeline is just that it looks realistic.
> So each task should have a realistic amount of time and the order in
> which tasks are listed should be logical.
> I commented below about how I think you could improve your timeline.

Your suggestions seem nice to me. I have thought about changing some
parts. I have described some changes below.

>> Here is a Google doc for my proposal.
>> https://docs.google.com/document/d/1stnDPA5Hs3u0a8sqoWZicTFpCz1wHP9bkifcKY13Ocw/edit?usp=sharing
>>
>> For the people who prefer the text-only version:
>>
>> ---
>>
>> Incremental rewrite of Git bisect
>>
>> About Me
>>
>> Basic Information
>>
>>
>> Name        Pranit Bauva
>> University  IIT Kharagpur
>> Major       Mining Engineering
>> Email       pranit.ba...@gmail.com
>> IRC         pungi-man
>> Blog        http://bauva.in
>> Timezone    IST (UTC +5:30)
>>
>> Background
>>
>> I am a first-year undergraduate in the Department of Mining
>> Engineering at the Indian Institute of Technology, Kharagpur. I am an
>> open source enthusiast and a member of the Kharagpur Linux Users
>> Group, a group of open source enthusiasts. I am quite familiar with
>> C, and I have been using shell for some time now and still find new
>> things about it every day. I used SVN when I was on Windows and
>> switched to Git when I moved to Linux. Git seems like magic. I have
>> always wanted to get involved in the development process, and Google
>> Summer of Code is an awesome way to achieve that.
>>
>>
>> Abstract
>>
>> Git bisect is a frequently used command that helps developers find
>> the commit that introduced a bug. Part of it is written in shell
>> script. I intend to convert those parts to low-level C code, thus
>> making them builtins. This will increase Git’s portability. The
>> efficiency of git bisect will also improve, although this will not
>> matter much, because most of the time during a bisection is spent
>> compiling or testing. Still, it will reduce the I/O overhead,
>> leaving more resources for the heavy process of compiling.
>>
>>
>> Problems Shell creates
>>
>> System Dependencies
>>
>> Shell code allows quick prototyping, but it introduces various
>> dependencies. Shell scripts often use POSIX utilities like cat, grep,
>> ls, mkdir, etc., which are not included in non-POSIX systems by
>> default. These scripts do not have access to Git’s internal
>> low-level API, so even trivial tasks have to be performed by
>> spawning a new process every time. As a result, when Git is ported
>> to Windows, it has to include all of these utilities (namely a shell
>> interpreter, perl bindings and much more).
>>
>> Scripts introduce extra overheads
>>
>> Shell scripts do not have access to Git’s internal API, which makes
>> excellent use of caching and thus reduces unnecessary I/O on user
>> configuration files, the repository index and the filesystem. By
>> using a builtin we could exploit this caching and reduce the
>> overhead. As compiling and testing already consume considerable
>> resources, it would be good to leave as many resources as possible
>> available for them.
>>
>> Potential Problems
>>
>> Rewriting may introduce bugs
>>
>> Rewriting the shell script in C might introduce some bugs. My method
>> of approach (described below) is designed to deal with this problem.
>> It cannot guarantee that the new code will behave exactly like the
>> old one, but it will greatly reduce the chance of a difference. The
>> reviews provided by senior members of the Git community will help a
>> lot in reducing bugs, since they know the common bugs and how to
>> work around them. Git’s test suite is also quite good and has very
>> good coverage.
>>
>> Rewritten code can be hard to understand
>>
>> Git does not like having many external dependencies, libraries or
>> executables other than what is provided by git itself and the
>> rewritten code should follow this. C 

Re: GSoC 2016 | Proposal | Incremental Rewrite of git bisect

2016-03-25 Thread Matthieu Moy
Christian Couder  writes:

> On Thu, Mar 24, 2016 at 12:27 AM, Pranit Bauva  wrote:
>
>> Unification of bisect.c and bisect--helper.c
>>
>> This will unify the algorithmic and non-algorithmic parts of bisect
>> bringing them under one heading to make the code clean.
>
> I am not sure this is needed and a good idea. Maybe you will rename
> "builtin/bisect--helper.c" to "builtin/bisect.c" and remove
> git-bisect.sh at the same time to complete the shell to C move. But
> the actual bisect.{c,h} might be useful as they are for other
> purposes.

Yes. My view on this is that builtin/*.c should be just user-interface,
and actual stuff should be outside builtin, ideally in a well-designed
and reusable library (typically re-usable by libgit2 or others to
provide another UI for the same feature). Not all commands work this
way, but I think this is a good direction to take.

> When you have sent one patch series, even a small one, then your main
> goal should be to have this patch series merged.

I'd add: to get a patch series merged, two things take time:

1) latency: leave time for other people to read and comment on your code.

2) extra-work required by reviewers.

You want to send series early because of 1) (then you can work on the
next series while waiting for reviews on the current one), and you need
to prioritize 2) over working on the next series to minimize in-flight
topics.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: GSoC 2016 | Proposal | Incremental Rewrite of git bisect

2016-03-25 Thread Christian Couder
On Thu, Mar 24, 2016 at 12:27 AM, Pranit Bauva  wrote:
> Hey!
>
> I have prepared a proposal for Google Summer of Code 2016. I know this
> is a bit late, but please try to give your comments and suggestions.
> My proposal could greatly benefit from them. Some questions:
>
> 1. Should I include more ways in which it can help Windows?

I don't think it is necessary.

> 2. Should I include the function names I intend to convert?

I don't think it is necessary, but if you want, you can take a look at
some big ones (or perhaps just one big one) and explain how you plan
to convert it (using which C functions or APIs).

> 3. Is my timeline (a bit different) going to affect me in any way?

What is important with the timeline is just that it looks realistic.
So each task should have a realistic amount of time and the order in
which tasks are listed should be logical.
I commented below about how I think you could improve your timeline.

> Here is a Google doc for my proposal.
> https://docs.google.com/document/d/1stnDPA5Hs3u0a8sqoWZicTFpCz1wHP9bkifcKY13Ocw/edit?usp=sharing
>
> For the people who prefer the text-only version:
>
> ---
>
> Incremental rewrite of Git bisect
>
> About Me
>
> Basic Information
>
>
> Name        Pranit Bauva
> University  IIT Kharagpur
> Major       Mining Engineering
> Email       pranit.ba...@gmail.com
> IRC         pungi-man
> Blog        http://bauva.in
> Timezone    IST (UTC +5:30)
>
> Background
>
> I am a first-year undergraduate in the Department of Mining
> Engineering at the Indian Institute of Technology, Kharagpur. I am an
> open source enthusiast and a member of the Kharagpur Linux Users
> Group, a group of open source enthusiasts. I am quite familiar with
> C, and I have been using shell for some time now and still find new
> things about it every day. I used SVN when I was on Windows and
> switched to Git when I moved to Linux. Git seems like magic. I have
> always wanted to get involved in the development process, and Google
> Summer of Code is an awesome way to achieve that.
>
>
> Abstract
>
> Git bisect is a frequently used command that helps developers find
> the commit that introduced a bug. Part of it is written in shell
> script. I intend to convert those parts to low-level C code, thus
> making them builtins. This will increase Git’s portability. The
> efficiency of git bisect will also improve, although this will not
> matter much, because most of the time during a bisection is spent
> compiling or testing. Still, it will reduce the I/O overhead,
> leaving more resources for the heavy process of compiling.
>
>
> Problems Shell creates
>
> System Dependencies
>
> Shell code allows quick prototyping, but it introduces various
> dependencies. Shell scripts often use POSIX utilities like cat, grep,
> ls, mkdir, etc., which are not included in non-POSIX systems by
> default. These scripts do not have access to Git’s internal
> low-level API, so even trivial tasks have to be performed by
> spawning a new process every time. As a result, when Git is ported
> to Windows, it has to include all of these utilities (namely a shell
> interpreter, perl bindings and much more).
>
> Scripts introduce extra overheads
>
> Shell scripts do not have access to Git’s internal API, which makes
> excellent use of caching and thus reduces unnecessary I/O on user
> configuration files, the repository index and the filesystem. By
> using a builtin we could exploit this caching and reduce the
> overhead. As compiling and testing already consume considerable
> resources, it would be good to leave as many resources as possible
> available for them.
>
> Potential Problems
>
> Rewriting may introduce bugs
>
> Rewriting the shell script in C might introduce some bugs. My method
> of approach (described below) is designed to deal with this problem.
> It cannot guarantee that the new code will behave exactly like the
> old one, but it will greatly reduce the chance of a difference. The
> reviews provided by senior members of the Git community will help a
> lot in reducing bugs, since they know the common bugs and how to
> work around them. Git’s test suite is also quite good and has very
> good coverage.
>
> Rewritten code can be hard to understand
>
> Git does not like having many external dependencies, libraries or
> executables other than what is provided by Git itself, and the
> rewritten code should follow this. C does not provide many of the
> facilities that shell does, such as text processing, so the C
> implementation of a short shell construct often spans multiple lines.
> C is also notorious for being a bit “cryptic”. This problem can be
> compensated for by well-written documentation with well-defined
> inputs, outputs and behavior.
>
> A peek into git bisect
>
> How does it help

GSoC 2016 | Proposal | Incremental Rewrite of git bisect

2016-03-23 Thread Pranit Bauva
Hey!

I have prepared a proposal for Google Summer of Code 2016. I know this
is a bit late, but please try to give your comments and suggestions.
My proposal could greatly benefit from them. Some questions:

1. Should I include more ways in which it can help Windows?
2. Should I include the function names I intend to convert?
3. Is my timeline (a bit different) going to affect me in any way?

Here is a Google doc for my proposal.
https://docs.google.com/document/d/1stnDPA5Hs3u0a8sqoWZicTFpCz1wHP9bkifcKY13Ocw/edit?usp=sharing

For the people who prefer the text-only version:

---

Incremental rewrite of Git bisect

About Me

Basic Information


Name        Pranit Bauva
University  IIT Kharagpur
Major       Mining Engineering
Email       pranit.ba...@gmail.com
IRC         pungi-man
Blog        http://bauva.in
Timezone    IST (UTC +5:30)

Background

I am a first-year undergraduate in the Department of Mining
Engineering at the Indian Institute of Technology, Kharagpur. I am an
open source enthusiast and a member of the Kharagpur Linux Users
Group, a group of open source enthusiasts. I am quite familiar with
C, and I have been using shell for some time now and still find new
things about it every day. I used SVN when I was on Windows and
switched to Git when I moved to Linux. Git seems like magic. I have
always wanted to get involved in the development process, and Google
Summer of Code is an awesome way to achieve that.


Abstract

Git bisect is a frequently used command that helps developers find
the commit that introduced a bug. Part of it is written in shell
script. I intend to convert those parts to low-level C code, thus
making them builtins. This will increase Git’s portability. The
efficiency of git bisect will also improve, although this will not
matter much, because most of the time during a bisection is spent
compiling or testing. Still, it will reduce the I/O overhead,
leaving more resources for the heavy process of compiling.
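
To make the idea concrete, here is a minimal sketch in C of the search
behind bisection. It assumes a strictly linear history stored oldest
first, a known good first commit and a known bad last commit; the real
git bisect works on a commit graph and is more involved. The names
first_bad_commit, is_bad_fn and bad_from_threshold are hypothetical
and are not part of Git.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical predicate: nonzero if the commit at position idx is bad. */
typedef int (*is_bad_fn)(size_t idx, void *ctx);

/*
 * Binary search over a linear history of n commits, oldest first.
 * Assumes commit 0 is good, commit n - 1 is bad, and that every commit
 * after the first bad one is also bad.
 * Returns the index of the first bad commit.
 */
size_t first_bad_commit(size_t n, is_bad_fn is_bad, void *ctx)
{
	size_t good = 0;
	size_t bad;

	assert(n >= 2);
	bad = n - 1;

	/* Invariant: commit `good` is good and commit `bad` is bad. */
	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		if (is_bad(mid, ctx))
			bad = mid;
		else
			good = mid;
	}
	return bad;
}

/* Example predicate for illustration: bad from index *ctx onwards. */
int bad_from_threshold(size_t idx, void *ctx)
{
	return idx >= *(size_t *)ctx;
}
```

With 10 commits and a threshold of 7, first_bad_commit(10,
bad_from_threshold, &threshold) tests indexes 4, 6 and 7 and returns 7.
In practice each call to the predicate stands for a compile-and-test
run, which is why the number of tested commits matters far more than
the cost of the search itself.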


Problems Shell creates

System Dependencies

Shell code allows quick prototyping, but it introduces various
dependencies. Shell scripts often use POSIX utilities like cat, grep,
ls, mkdir, etc., which are not included in non-POSIX systems by
default. These scripts do not have access to Git’s internal
low-level API, so even trivial tasks have to be performed by
spawning a new process every time. As a result, when Git is ported
to Windows, it has to include all of these utilities (namely a shell
interpreter, perl bindings and much more).
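
As a rough illustration of the point about spawning processes, the
sketch below does in-process, using only the standard C library, what
a script would typically do with "test -s" and "read". It deliberately
does not use Git's internal API, and the function names
file_is_nonempty and read_first_line are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Roughly `test -s path`: nonzero if the file exists and is not empty. */
int file_is_nonempty(const char *path)
{
	FILE *fp = fopen(path, "rb");
	int nonempty;

	if (!fp)
		return 0;
	nonempty = (fgetc(fp) != EOF);
	fclose(fp);
	return nonempty;
}

/*
 * Roughly `read line <path`: copy the first line of the file into buf,
 * without the trailing newline. Lines longer than size - 1 characters
 * are truncated. Returns 0 on success, -1 if the file cannot be read.
 */
int read_first_line(const char *path, char *buf, size_t size)
{
	FILE *fp;

	assert(buf && size > 0);
	fp = fopen(path, "r");
	if (!fp)
		return -1;
	if (!fgets(buf, (int)size, fp)) {
		fclose(fp);
		return -1;
	}
	fclose(fp);
	buf[strcspn(buf, "\r\n")] = '\0';
	return 0;
}
```

Neither function starts a child process, which is the saving the
paragraph above refers to. It also shows the cost discussed later in
the proposal: a one-line shell test becomes a dozen lines of C.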

Scripts introduce extra overheads

Shell scripts do not have access to Git’s internal API, which makes
excellent use of caching and thus reduces unnecessary I/O on user
configuration files, the repository index and the filesystem. By
using a builtin we could exploit this caching and reduce the
overhead. As compiling and testing already consume considerable
resources, it would be good to leave as many resources as possible
available for them.

Potential Problems

Rewriting may introduce bugs

Rewriting the shell script in C might introduce some bugs. My method
of approach (described below) is designed to deal with this problem.
It cannot guarantee that the new code will behave exactly like the
old one, but it will greatly reduce the chance of a difference. The
reviews provided by senior members of the Git community will help a
lot in reducing bugs, since they know the common bugs and how to
work around them. Git’s test suite is also quite good and has very
good coverage.

Rewritten code can be hard to understand

Git does not like having many external dependencies, libraries or
executables other than what is provided by Git itself, and the
rewritten code should follow this. C does not provide many of the
facilities that shell does, such as text processing, so the C
implementation of a short shell construct often spans multiple lines.
C is also notorious for being a bit “cryptic”. This problem can be
compensated for by well-written documentation with well-defined
inputs, outputs and behavior.

A peek into git bisect

How does it help?

Git bisect helps software developers find the commit that introduced
a regression. Developers are interested in knowing this because a
commit usually changes only a small amount of code. It is much easier
to understand and fix a problem when you only need to check a very
small set of changes than when you don’t know where to look. The
problem may not be exactly in that commit, but it will be related to
the behavior introduced in that commit. Software bugs can be a
nightmare when the code base is very large. There would be a lot of
sleepless nights spent figuring out the part which causes the error.
This is where git bisect helps. This is one of
the most sought after tool