nervoussidd commented on issue #24903: URL: https://github.com/apache/beam/issues/24903#issuecomment-1463440208
Hey Anand,

I am attaching a file containing the script that trains a model on the housing data and saves the trained model to GCS. I don't have a GCS account, so I have only put in the code that would save the model to a GCS location; the file contains just the code and reflects what it will do. My system has some limitations, which I am trying to fix, that currently keep me from running this in a pipeline. For now I would just like your feedback on the code, which will help me fix the issue. Thank you for your tremendous support and patience throughout.

On Tue, Feb 28, 2023 at 10:20 PM Anand Inguva ***@***.***> wrote:

> Hey Anand, sorry, but because of my other commitments (mid-semester exams, tutoring, and GSoC preparation) I was not able to focus on this project. To update you on my progress: I have already learned what I need to create the script, and I need a few more days to finish it. Again, I am sorry for not updating you on my progress earlier. I hope you understand.
>
> Hi, please take your time and work on Beam when you can. I am here to help you in any way I can. Thanks for the update.
import pickle

import pandas as pd
from sklearn.linear_model import LinearRegression
from google.cloud import storage

BUCKET_NAME = "your-bucket-name"
MODEL_NAME = "linear_regression_model.pkl"
TRAINING_DATA = "gs://{}/data/housing_train.csv".format(BUCKET_NAME)

# loading the training data
# (reading a gs:// path with pandas requires the gcsfs package)
housing_data = pd.read_csv(TRAINING_DATA)

# separating the features and target variable
X_train = housing_data.drop("median_house_value", axis=1)
y_train = housing_data["median_house_value"].values

# train the model
model = LinearRegression()
model.fit(X_train, y_train)

# save the trained model to GCS
model_file = pickle.dumps(model)
storage_client = storage.Client()
bucket = storage_client.bucket(BUCKET_NAME)
blob = bucket.blob(MODEL_NAME)
blob.upload_from_string(model_file)

print("Trained model saved to GCS path: gs://{}/{}".format(BUCKET_NAME, MODEL_NAME))

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
