I don't see an unresolved reference to xrange, but I do see that it can't import sklearn.
Did you build scikit-learn?
See:
http://scikit-learn.org/dev/developers/contributing.html#retrieving-the-latest-code

Either do

make
or
python setup.py build_ext -i
or
python setup.py develop
or
pip install -e .

(which all do slightly different things)

I'd probably go with the first if you have another installation of scikit-learn on your machine
and the last if you want to make that your primary installation.
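
To check which installation Python actually picks up after running one of the commands above, something like the following may help. It is only a minimal sketch: the helper name `locate_package` is made up for illustration, and it relies solely on the standard-library `importlib.util.find_spec`.

```python
import importlib.util


def locate_package(name):
    """Return the file a top-level package would be imported from, or None.

    `locate_package` is a hypothetical helper, not part of scikit-learn.
    find_spec only searches for a top-level package; it does not import it.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # the package is not importable from this interpreter
    return spec.origin


# Example: locate_package("sklearn") gives the path of sklearn/__init__.py
# if the package is importable, and None otherwise.
```

If the printed path points into your clone, the in-place build is the one being used; if it points into site-packages, another installation takes precedence.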

Cheers,
Andy

On 06/15/2016 01:01 AM, Basil Beirouti wrote:
Hello Pavel and Joel,

I forked the repository and cloned it on my machine. I'm using PyCharm on a Mac, and while looking at text.py, I'm getting an unresolved reference for "xrange" at line 28:

from ..externals.six.moves import range
PyCharm says Function 'six.py' is too large to analyze, so I'm not sure if this error is somehow related to that. I decided to try to build the code as a sanity check, but I can't find any reliable instructions on how to do that. Naively, I opened a terminal, changed to the directory above the "scikit-learn" folder (where I had cloned my fork) and tried to run:

$ python3 setup.py install

Which didn't work. I got this error:

ImportError: No module named 'sklearn'

Can someone point me in the right direction? And how can the code try to import sklearn if it doesn't exist yet? Note I haven't installed the release version of scikit-learn using pip or any other tool, but I should be able to bootstrap it from the source code, right?

Here's the full error message if it helps. Forgive me if it's a silly mistake, but I haven't found any reliable guidelines online.

  File "setup.py", line 84, in <module>
    from numpy.distutils.core import setup
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/distutils/core.py", line 26, in <module>
    from numpy.distutils.command import config, config_compiler, \
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/distutils/command/build_ext.py", line 18, in <module>
    from numpy.distutils.system_info import combine_paths
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/distutils/system_info.py", line 232, in <module>
    triplet = str(p.communicate()[0].decode().strip())
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 791, in communicate
    stdout = _eintr_retry_call(self.stdout.read)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 476, in _eintr_retry_call
    return func(*args)
KeyboardInterrupt

Basils-MacBook-Pro:sklearn basilbeirouti$ python3 setup.py install
non-existing path in '__check_build': '_check_build.c'
Appending sklearn.__check_build configuration to sklearn
Ignoring attempt to set 'name' (from 'sklearn' to 'sklearn.__check_build')
Appending sklearn._build_utils configuration to sklearn
Ignoring attempt to set 'name' (from 'sklearn' to 'sklearn._build_utils')
Appending sklearn.covariance configuration to sklearn
Ignoring attempt to set 'name' (from 'sklearn' to 'sklearn.covariance')
Appending sklearn.covariance/tests configuration to sklearn
Ignoring attempt to set 'name' (from 'sklearn' to 'sklearn.covariance/tests')
Appending sklearn.cross_decomposition configuration to sklearn
Ignoring attempt to set 'name' (from 'sklearn' to 'sklearn.cross_decomposition')
Appending sklearn.cross_decomposition/tests configuration to sklearn
Ignoring attempt to set 'name' (from 'sklearn' to 'sklearn.cross_decomposition/tests')
Appending sklearn.feature_selection configuration to sklearn
Ignoring attempt to set 'name' (from 'sklearn' to 'sklearn.feature_selection')
Appending sklearn.feature_selection/tests configuration to sklearn
Ignoring attempt to set 'name' (from 'sklearn' to 'sklearn.feature_selection/tests')
Appending sklearn.gaussian_process configuration to sklearn
Ignoring attempt to set 'name' (from 'sklearn' to 'sklearn.gaussian_process')
Appending sklearn.gaussian_process/tests configuration to sklearn
Ignoring attempt to set 'name' (from 'sklearn' to 'sklearn.gaussian_process/tests')
Appending sklearn.mixture configuration to sklearn
Ignoring attempt to set 'name' (from 'sklearn' to 'sklearn.mixture')
Appending sklearn.mixture/tests configuration to sklearn
Ignoring attempt to set 'name' (from 'sklearn' to 'sklearn.mixture/tests')
Appending sklearn.model_selection configuration to sklearn
Ignoring attempt to set 'name' (from 'sklearn' to 'sklearn.model_selection')
Appending sklearn.model_selection/tests configuration to sklearn
Ignoring attempt to set 'name' (from 'sklearn' to 'sklearn.model_selection/tests')
Appending sklearn.neural_network configuration to sklearn
Ignoring attempt to set 'name' (from 'sklearn' to 'sklearn.neural_network')
Appending sklearn.neural_network/tests configuration to sklearn
Ignoring attempt to set 'name' (from 'sklearn' to 'sklearn.neural_network/tests')
Appending sklearn.preprocessing configuration to sklearn
Ignoring attempt to set 'name' (from 'sklearn' to 'sklearn.preprocessing')
Appending sklearn.preprocessing/tests configuration to sklearn
Ignoring attempt to set 'name' (from 'sklearn' to 'sklearn.preprocessing/tests')
Appending sklearn.semi_supervised configuration to sklearn
Ignoring attempt to set 'name' (from 'sklearn' to 'sklearn.semi_supervised')
Appending sklearn.semi_supervised/tests configuration to sklearn
Ignoring attempt to set 'name' (from 'sklearn' to 'sklearn.semi_supervised/tests')

Warning: Assuming default configuration (./_build_utils/{setup__build_utils,setup}.py was not found)
Warning: Assuming default configuration (./covariance/{setup_covariance,setup}.py was not found)
Warning: Assuming default configuration (./covariance/tests/setup_covariance/{setup_covariance/tests,setup}.py was not found)
Warning: Assuming default configuration (./cross_decomposition/{setup_cross_decomposition,setup}.py was not found)
Warning: Assuming default configuration (./cross_decomposition/tests/setup_cross_decomposition/{setup_cross_decomposition/tests,setup}.py was not found)
Warning: Assuming default configuration (./feature_selection/{setup_feature_selection,setup}.py was not found)
Warning: Assuming default configuration (./feature_selection/tests/setup_feature_selection/{setup_feature_selection/tests,setup}.py was not found)
Warning: Assuming default configuration (./gaussian_process/{setup_gaussian_process,setup}.py was not found)
Warning: Assuming default configuration (./gaussian_process/tests/setup_gaussian_process/{setup_gaussian_process/tests,setup}.py was not found)
Warning: Assuming default configuration (./mixture/{setup_mixture,setup}.py was not found)
Warning: Assuming default configuration (./mixture/tests/setup_mixture/{setup_mixture/tests,setup}.py was not found)
Warning: Assuming default configuration (./model_selection/{setup_model_selection,setup}.py was not found)
Warning: Assuming default configuration (./model_selection/tests/setup_model_selection/{setup_model_selection/tests,setup}.py was not found)
Warning: Assuming default configuration (./neural_network/{setup_neural_network,setup}.py was not found)
Warning: Assuming default configuration (./neural_network/tests/setup_neural_network/{setup_neural_network/tests,setup}.py was not found)
Warning: Assuming default configuration (./preprocessing/{setup_preprocessing,setup}.py was not found)
Warning: Assuming default configuration (./preprocessing/tests/setup_preprocessing/{setup_preprocessing/tests,setup}.py was not found)
Warning: Assuming default configuration (./semi_supervised/{setup_semi_supervised,setup}.py was not found)
Warning: Assuming default configuration (./semi_supervised/tests/setup_semi_supervised/{setup_semi_supervised/tests,setup}.py was not found)
Traceback (most recent call last):

  File "setup.py", line 85, in <module>
    setup(**configuration(top_path='').todict())
  File "setup.py", line 44, in configuration
    config.add_subpackage('cluster')
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/numpy/distutils/misc_util.py", line 1003, in add_subpackage
    caller_level = 2)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/numpy/distutils/misc_util.py", line 972, in get_subpackage
    caller_level = caller_level + 1)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/numpy/distutils/misc_util.py", line 884, in _get_configuration_from_setup_py
    ('.py', 'U', 1))
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/imp.py", line 234, in load_module
    return load_source(name, filename, file)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/imp.py", line 172, in load_source
    module = _load(spec)
  File "<frozen importlib._bootstrap>", line 693, in _load
  File "<frozen importlib._bootstrap>", line 673, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 662, in exec_module
  File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed
  File "./cluster/setup.py", line 8, in <module>
    from sklearn._build_utils import get_blas_info
ImportError: No module named 'sklearn'
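
One possible reading of the output above, not confirmed anywhere in this thread: the shell prompt shows the command being run from inside the `sklearn` package directory, so the nested `./cluster/setup.py` cannot resolve `from sklearn._build_utils import ...` because the repository root is not on the import path. A minimal sketch of a pre-build check follows, assuming the repository root is the directory that holds `setup.py` next to the `sklearn` package; `looks_like_repo_root` is a hypothetical helper, not part of scikit-learn.

```python
import os


def looks_like_repo_root(path):
    """True if `path` has a setup.py next to a `sklearn` package directory.

    Hypothetical helper for illustration only; it inspects the file layout
    and does not import or build anything.
    """
    has_setup = os.path.isfile(os.path.join(path, "setup.py"))
    has_package = os.path.isfile(os.path.join(path, "sklearn", "__init__.py"))
    return has_setup and has_package


# Example: looks_like_repo_root(os.getcwd()) should be True before building.
```

Under that assumption, the check would fail inside the `sklearn` subdirectory and pass one level up, which is where the build commands are meant to be run.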

On Tue, Jun 14, 2016 at 11:41 AM, <[email protected] <mailto:[email protected]>> wrote:

    Today's Topics:

       1. Re: Adding BM25 relevance function (Pavel Soriano)
       2. Re: The culture of commit squashing (Andreas Mueller)
       3. Re: The culture of commit squashing (Tom DLT)


    ----------------------------------------------------------------------

    Message: 1
    Date: Tue, 14 Jun 2016 16:11:10 +0000
    From: Pavel Soriano <[email protected]
    <mailto:[email protected]>>
    To: Scikit-learn user and developer mailing list
            <[email protected] <mailto:[email protected]>>
    Subject: Re: [scikit-learn] Adding BM25 relevance function
    Message-ID:
<can0wwk93r2aw9no65cgicw5hqg7-ofyvzamjqpxpegtxmsq...@mail.gmail.com <mailto:can0wwk93r2aw9no65cgicw5hqg7-ofyvzamjqpxpegtxmsq...@mail.gmail.com>>
    Content-Type: text/plain; charset="utf-8"

    Hey,

    Good thing that you are trying to finish this.

    Well, I looked into my old notes, and the Delta tf-idf comes from the
    "Delta TFIDF: An Improved Feature Space for Sentiment Analysis"
    <http://ebiquity.umbc.edu/_file_directory_/papers/446.pdf> paper. I guess
    it is not very popular and apparently it has a drawback: it does not take
    into account the number of times a word occurs in each document while
    calculating the distribution amongst classes. At least that is what I
    wrote in my notes...

    As for the delta idf... If it helps, I can look into my old code, because
    I do not know what I was talking about. I guess it has to do somehow with
    the paper cited before.

    Cheers,

    Pavel Soriano




    On Tue, Jun 14, 2016 at 5:49 PM Basil Beirouti
    <[email protected] <mailto:[email protected]>>
    wrote:

    > Hi Joel,
    >
    > Thanks for your response and for digging up that archived thread, it
    > gives me a lot of clarity.
    >
    > I see your point about BM25, but I think in most cases where TFIDF makes
    > sense, BM25 makes sense as well, but it could be "overkill".
    >
    > Consider that TFIDF does not produce normalized results either
    > <http://scikit-learn.org/stable/auto_examples/text/document_clustering.html#example-text-document-clustering-py>.
    > If BM25 requires dimensionality reduction (e.g. using LSA), so too would
    > TFIDF. The term-document matrix is the same size no matter which
    > weighting scheme is used. The only difference is that BM25 produces
    > better results when the corpus is large enough that the term frequency
    > in a document, and the document frequency in the corpus, can vary
    > considerably across a broad range of values. Maybe you could even say
    > TFIDF and BM25 are the same equation except BM25 has a few additional
    > hyperparameters (b and k).
    >
    > So is the advantage that BM25 provides for large diverse corpora worth
    > it? Or is it marginal? Perhaps you can point me to some more examples
    > where TFIDF is used (in supervised setting preferably) and I can plug in
    > BM25 in place of TFIDF and see how it compares. Here are some I found:
    >
    > http://scikit-learn.org/stable/tutorial/text_analytics/working_with_text_data.html
    > *(supervised)*
    >
    > http://scikit-learn.org/stable/auto_examples/text/document_clustering.html#example-text-document-clustering-py
    > *(unsupervised)*
    >
    > Thank you!
    > Basil
    >
    > PS: By the way, I'm not familiar with the delta-idf transform that Pavel
    > mentions in the archive you linked, I'll have to delve deeper into that.
    > I agree with the response to Pavel that he should be putting it in a
    > separate class, not adding on to the TFIDF. I think it would take me
    > about 6-8 weeks to adapt my code to the fit transform model and submit a
    > pull request.
    >
    >
    >
    >
    >
    >
    > _______________________________________________
    > scikit-learn mailing list
    > [email protected] <mailto:[email protected]>
    > https://mail.python.org/mailman/listinfo/scikit-learn
    >
    --
    Pavel SORIANO

    PhD Student
    ERIC Laboratory
    Université de Lyon
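
The remark quoted in the message above, that TFIDF and BM25 are roughly the same equation apart from the hyperparameters b and k, can be illustrated with a small sketch. This is not scikit-learn code and not the code discussed in the thread: the function names are made up, and using the same smoothed IDF for both weightings is an assumption made here so the two are directly comparable (the classic BM25 formulation uses a different IDF term).

```python
import math


def idf(n_docs, doc_freq):
    """Smoothed inverse document frequency (one common variant, assumed here)."""
    return math.log((n_docs + 1) / (doc_freq + 1)) + 1.0


def tfidf_weight(tf, n_docs, doc_freq):
    """Plain tf * idf weight for one term in one document."""
    return tf * idf(n_docs, doc_freq)


def bm25_weight(tf, n_docs, doc_freq, doc_len, avg_doc_len, k=1.2, b=0.75):
    """BM25-style weight: tf saturates with k and is length-normalised with b."""
    length_norm = 1.0 - b + b * (doc_len / avg_doc_len)
    saturated_tf = tf * (k + 1.0) / (tf + k * length_norm)
    return saturated_tf * idf(n_docs, doc_freq)
```

In this sketch, setting b=0 and letting k grow very large makes `bm25_weight` approach `tfidf_weight`, while for small k the term-frequency factor can never exceed k + 1, which is the saturation behaviour that distinguishes the two.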

    ------------------------------

    Message: 2
    Date: Tue, 14 Jun 2016 12:13:29 -0400
    From: Andreas Mueller <[email protected] <mailto:[email protected]>>
    To: Scikit-learn user and developer mailing list
            <[email protected] <mailto:[email protected]>>
    Subject: Re: [scikit-learn] The culture of commit squashing
    Message-ID: <[email protected]
    <mailto:[email protected]>>
    Content-Type: text/plain; charset="windows-1252"; Format="flowed"

    I'm +1 for using the button when appropriate.
    I think it should be up to the merging person to make a call whether a
    squash is a better logical unit than all the commits.
    I would set like a soft limit at ~5 commits or something. If your PR has
    more than 5 separate big logical units, it's probably too big.

    The button is enabled in the settings but I can't see it.
    Am I being stupid?

    On 06/14/2016 06:58 AM, Joel Nothman wrote:
    > Sounds good to me. Thank goodness someone reads the documentation!
    >
    > On 14 June 2016 at 19:51, Alexandre Gramfort
    > <[email protected] <mailto:[email protected]>> wrote:
    >
    >     > We could stop squashing during development, and use the new
    >     > Squash-and-Merge button on GitHub.
    >     > What do you think?
    >
    >     +1
    >
    >     the reason I see for squashing during dev is to avoid killing the
    >     browser when reviewing. It really rarely happens though.
    >
    >     A


    ------------------------------

    Message: 3
    Date: Tue, 14 Jun 2016 18:40:39 +0200
    From: Tom DLT <[email protected]
    <mailto:[email protected]>>
    To: Scikit-learn user and developer mailing list
            <[email protected] <mailto:[email protected]>>
    Subject: Re: [scikit-learn] The culture of commit squashing
    Message-ID:
<CAGKmC=sRMbwo1Pjm=ph3r6oqsmvzuzdbmjvj09yjwkk0+yq...@mail.gmail.com <mailto:ph3r6oqsmvzuzdbmjvj09yjwkk0%[email protected]>>
    Content-Type: text/plain; charset="utf-8"

    @Andreas
    It's a bit hidden: You need to click on "Merge pull-request", then do
    *not* click on "Confirm merge", but on the small arrow to the right, and
    select "Squash and merge".

    2016-06-14 18:13 GMT+02:00 Andreas Mueller <[email protected]
    <mailto:[email protected]>>:

    > I'm +1 for using the button when appropriate.
    > I think it should be up to the merging person to make a call whether a
    > squash is a better logical unit than all the commits.
    > I would set like a soft limit at ~5 commits or something. If your PR has
    > more than 5 separate big logical units, it's probably too big.
    >
    > The button is enabled in the settings but I can't see it.
    > Am I being stupid?
    >
    >
    > On 06/14/2016 06:58 AM, Joel Nothman wrote:
    >
    > Sounds good to me. Thank goodness someone reads the documentation!
    >
    > On 14 June 2016 at 19:51, Alexandre Gramfort
    > <[email protected] <mailto:[email protected]>> wrote:
    >
    >> > We could stop squashing during development, and use the new
    >> > Squash-and-Merge button on GitHub.
    >> > What do you think?
    >>
    >> +1
    >>
    >> the reason I see for squashing during dev is to avoid killing the
    >> browser when reviewing. It really rarely happens though.
    >>
    >> A

    ------------------------------

    End of scikit-learn Digest, Vol 3, Issue 27
    *******************************************




_______________________________________________
scikit-learn mailing list
[email protected]
https://mail.python.org/mailman/listinfo/scikit-learn
